BUG: EP status page page=elasticpress-status-report is very slow (18s+) due to multiple huge "Distinct Meta Keys" queries
Describe the bug
Whenever we load the EP status page /wordpress/wp-admin/admin.php?page=elasticpress-status-report, it takes 18-20s. I finally installed Query Monitor and it showed that this page runs 32 (currently) humongous queries like this:
SELECT DISTINCT meta_key
FROM wp_postmeta
WHERE post_id IN ( 642170,642211,642481,643012,643084,643098,643128,643348,643567,644581,645503,645518,645616,645655,645805,646050,646069,646184,646657,646727,646833,647514,647765,647900,647912,647924,647927,647969,647978,648409,648629,648632,648658,648664,648780,648867,649360,649363,649389,649392, ..., 1879856,1879890,1879921,1879941,1880037,1880040,1880145,1880229,1880308,1880326,1880447,1880454,1880602,1880718,1880737 )
each taking just under 1 second. It looks like it's batching IDs in batches of 10100 (weird number, but OK) and doing 32 queries. Presumably, all to populate this field "Distinct Meta Keys"?
There has to be a better way to do this? Maybe load this information via Ajax on demand or optimize the queries so that they run faster and perhaps without listing hundreds of thousands of post IDs?
Steps to Reproduce
- Load /wordpress/wp-admin/admin.php?page=elasticpress-status-report on a large site.
Screenshots, screen recording, code snippet
Environment information
No response
WordPress and ElasticPress information
Partially redacted.
WordPress
WordPress Environment
wp_version: 6.4.1 is_multisite: false revisions: 15
Server Environment
php_version: 7.4.33 memory_limit: 1G timeout: 600
Indexable Content
XXXXXXXXXXXXX — https://www.XXXXXXXXXXXXX.com/wordpress
post_count: 0 page_count: 9 XXXXXXXXXXXXX: 877,888 XXXXXXXXXXXXX: 1,441,662 XXXXXXXXXXXXX: 565,659 attachment_count: 154,949 XXXXXXXXXXXXX: 23 XXXXXXXXXXXXX: 15 XXXXXXXXXXXXX: 14 total-all-post-types: 25 distinct-meta-keys: XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, XXXXXXXXXXXXX, and XXXXXXXXXXXXX
ElasticPress
Settings
host: http://XXXXXXXXXXXXX:9200 index_prefix: language: en_US per_page: 350 network_active: false
Timeouts
request_timeout: 5 index_document_timeout: 15 bulk_request_timeout: 30
Elasticsearch Indices
XXXXXXXXXXXXX
health: green status: open index: XXXXXXXXXXXXX uuid: XXXXXXXXXXXXX pri: 20 rep: 3 docs.count: 3040168 docs.deleted: 405218 store.size: 28.2gb pri.store.size: 7gb total_fields_limit: 5000
Last Sync
2023/08/02 6:22:57 pm
method: WP Dashboard end_date_time: 2023/08/09 3:55:23 am total_time: 9 hours, 32 minutes, 27 seconds total: 3388535 synced: 3410235 skipped: 0 failed: 0 errors: array ( )
Feature Settings
Custom Search Results
active: true force_inactive: false
Facets
active: true force_inactive: false match_type: all
Post Search
active: true decaying_enabled: true force_inactive: false highlight_enabled: false highlight_excerpt: false highlight_tag: mark synonyms_editor_mode: simple
Protected Content
active: true force_inactive: false
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Loading this via Ajax seems to be the best approach for now (believe it or not, the query is already optimized for what we need :( )
Adding this to the 5.2.0 milestone for now. In the meantime, it is possible to bypass that report entirely with this snippet:
add_filter( 'ep_status_report_reports', function( $reports ) {
unset( $reports['indexable'] );
return $reports;
});
Thanks for the snippet, it indeed makes the status page fly (0.3s), but it also removes all of the data rather than just the Distinct Meta Keys bit. I'll put it in temporarily until this bug is resolved and remove it after. Looking forward to the proper fix.