lightweight-term-count-update icon indicating copy to clipboard operation
lightweight-term-count-update copied to clipboard

Improve performance for (child) attachments

Open mboynes opened this issue 7 years ago • 2 comments

From #2 (direct link to PR comment):

This returns only latest 10 attachments. I'm not sure whether setting the posts_per_page should be set to -1 or whether we should come up with some clever direct SQL. Thoughts?

I was wondering that too. I'm not sure what to do here, to be honest, but I do have some suggestions. To maintain parity with core, we would need to set 'posts_per_page' => -1 to catch all possible attachments, but that could theoretically be worse than _update_post_term_count(). Here are a few ideas:

  1. We could query just for 'fields' => 'ids'. All we ever use of the post objects are the ID and post_type, and we already know the post type is attachment. This would keep the query light, but it still means that every attachment could theoretically trigger another query to update the term count, so if a post has 100k attachments we could theoretically be looking at 100,002 total sql queries.
    • A sub-option here would be to refactor the code to be able to pool the attachment queries into one query. That would turn the 100,002 queries down to 1 or 2 queries, where one of them would have 100k+ tt_ids in the IN() clause.
  2. We could do a quick count synchronously for the post and then setup an asynchronous cron task to count attachments, if there are any (or if there are more than, say, 100).
  3. Count the attachments, and if the number is high, just let _update_post_term_count() run as it usually would. Additionally, we could add a filter to optionally override it, and then people can intervene with custom code if it's a real problem for them, and do something else like (1) or (2) or offload the query to Elasticsearch or something.

(1) or (1a) seem like the best options to me of what I've considered. (2) feels over-engineered for the 0.001%, or whatever hilariously low percentage it would be, of use cases that could be problematic. (3) is a fine option too -- we fixed all the use cases most people will ever encounter, and for that one in a million, we simply left it as-is but opened a door for that site to easily solve their problem.

For the first release of the plugin, we went with option (1). This does have potential to be slow, as noted above, so other options should still be considered.

mboynes avatar Mar 28 '17 16:03 mboynes

I've been thinking about this too, and have come to like method 3 as described by you guys above. I've come up with a possible way to do this. Main points:

  • a filterable attachment_limit is added (I defaulted to 1000 but perhaps it should be lower?) This can be modified with the ltcu_attachment_limit filter. Filtering to -1 means the shortcut will always happen regardless of number of attachments.
  • the attachment check on transition_post_status is bumped up so it happens before anything else. We avoid unnecessary no-limit queries by asking for one more attachment than the limit. If the limit is exceeded, first try to call anything hooked to ltcu_alternate_transition_post_status ... if nothing is there, go ahead with:
wp_defer_term_counting( false );
_update_term_count_on_transition_post_status( $new_status, $old_status, $post );

(I'm a bit uncomfortable with calling an underscored function like that, but it seemed like the simplest way forward without copying some code from core.)

https://github.com/Automattic/lightweight-term-count-update/commit/37f1800241849fd29f460cf8be150297fae2afe8 is the main commit

https://github.com/Automattic/lightweight-term-count-update/commit/21ef2547f0b0e72df7eba56268b19a2fdd4e1655 fixes a small additional bug

This needs lots of testing, but I wanted to push this while i was thinking about it.

mattoperry avatar Mar 29 '17 21:03 mattoperry

hey @david-binda would you mind having a look at this? Curious what you think ...

mattoperry avatar Mar 30 '17 21:03 mattoperry