Recently, our WordPress backend was experiencing slowdowns. We are using some plugins, yes I know they can contribute to those slowdowns, but unfortunately, those plugins are a requirement for our client. We follow a strict betting process when we select a third-party plugin.
Houston, we have a problem
We have several Cron jobs to execute some custom programs. At random times, we got reports from our content producers that the CMS was slowing down, which was very frustrating working in the CMS. We verify that, indeed, from time to time, the CMS was slow.
How we found the problem
The first step to figuring out the problem was to train our content producers to correctly and helpfully report the problem. Provide us with the time of day, and as much information as they could about what they were experiencing, and let us know as soon as possible. The good old days of just emailing us that the WordPress CMS was slow were over.
The next step was to methodically turn on and off plugins to determine if a plugin was causing the problem. Now turning them on and off did not produce too many useful results. But, turning them off and on, in conjunction with a monitoring tool, did produce the desired results. The are several monitoring tools like the Query Monitor Plugin, which is free. I also rely on a subscription base tool called New Relic.
What we found about the problem
We first found that a custom import cron job that we run every 5, 10, and 20 minutes was very inefficient. We are using the core WordPress table wp_postmeta to store some information about the post. See the WordPress Database Description for more details.
We had over 60,000 posts, which generated over 120,000 child posts(attachments, revisions, etc) and over 1.2 million entries in the wp_postmeta. Because of the way that WordPress joins by default the wp_posts and wp_postmeta, the query was doing a very inefficient join that was taking over 20 seconds to execute, and we were executing that query several times every time the job was executed.
Researching the way the wp_postmeta table is indexed, we learned that it is indexed by post_id and meta_key, but not by meta_value. We were querying for a couple of meta_values, so the query was doing a full table scan every time that the query was executed.
As if that was not bad enough, the problem was compounded by the number of imported old posts that we kept on the table, over 44,000. We decided that for our internal posts, we would keep them forever, but for imported posts, we will keep them for 21 days. After that, they became irrelevant. We looked at Google Analytics and those posts did not generate enough traffic to warrant keeping them.
What we did to fix the problem
WARNING: By choosing to execute the commands mentioned herein, you acknowledge that you do so at your own risk. The author disclaims any liability for any direct or indirect damages, losses, or issues that may result from the use or misuse of the provided commands in your system.
We understood that the recommended way of keeping extra custom information about posts is to use the wp_postmeta table, which is baked into WordPress. We create a custom table to keep the data. We still indexed by post id, but now we only had 100,000 records vs. millions. The cron job went from taking several minutes to 10 to 20 seconds.
Also, we decided to purge all of the old imported stories so we went from 66,000 posts to 12,000 posts, which greatly improve not only the custom jobs but the WordPress core jobs and plugins jobs.
How we did it
As always, we started with the recommended way of deleting posts, doing a wp_delete_posts(), but because of the number of child posts and posts meta attributes associated with each post, it was taking several minutes to delete each post. If we had 44,000 posts to delete and each post took two minutes to be deleted, it would have taken close to 2 months to complete. It seems crazy, but it makes sense because the wp_delete_posts()_ is optimized to delete one post at a time.
What we decided to do was to take the risk and manually delete the posts and related records manually. Also, we deleted the images that came with the imported posts manually. We were able to do that because we stored the imported images in a separate directory from the default WordPress. We were able to test and retest the process because our hosting vendor allows us to create what they call multi-dev environments as frequently as we need to and then delete them.
Running the queries to delete the wp_posts, wp_postmeta, and wp_term_relationship in the database, we were able to delete all of the posts in less than 5 minutes.
If you want to see the SQL Statements that we ran, see:
How to Manually Purge a Large Number of WordPress Posts in WordPress
What we learned
Even if it is not recommended for SEO purposes to delete posts, you can not say NEVER. Depending on the number of posts that you have, and your hosting budget, you may need to purge old posts. Purging posts is not necessary for everyone, only for websites that generate a great number of posts. But if you need to purge posts, make sure you do it daily so you can use the recommended wp_delete_posts() as posts reach the age that you have decided it’s time to purge.
As a final tip for reading to the end, I would like to introduce you to the WordPress function wp_defer_term_counting(). Calling this function becomes truly beneficial when for instance you are performing a mass insertion or deletion to the database of posts and assigning or removing terms to each object as part of the process. We did implement this function and we trimmed several seconds when both inserting and deleting several posts.
wp_defer_term_counting(true); //defer counting terms
//mass insertion or deletion or posts and assignment of terms here
wp_defer_term_counting(false); //count terms after completing business logic