Slow Query on mysql.stats_meta Table
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
- Create a large cluster with a heavy workload.
- Check the slow logs.
2. What did you expect to see? (Required)
No slow queries on mysql.stats_meta table.
3. What did you see instead? (Required)
```sql
SELECT version, table_id, modify_count, count, snapshot, last_stats_histograms_version FROM mysql.stats_meta WHERE version > 462255440511108949 ORDER BY version;
```
| id | estRows | estCost | actRows | task | access object | execution info | operator info | memory | disk |
|---|---|---|---|---|---|---|---|---|---|
| Sort_5 | 3333.33 | 2467712.52 | 587 | root | | time:2.87s, loops:2 | mysql.stats_meta.version | 42.1 KB | 0 Bytes |
| └─TableReader_10 | 3333.33 | 487159.13 | 587 | root | | time:2.87s, loops:2, cop_task: {num: 6, max: 2.87s, min: 2.42ms, avg: 499.4ms, p95: 2.87s, max_proc_keys: 85507, p95_proc_keys: 85507, tot_proc: 156ms, tot_wait: 861.5µs, copr_cache_hit_ratio: 0.00, build_task_duration: 22.9µs, max_distsql_concurrency: 6}, rpc_info:{Cop:{num_rpc:6, total_time:3s}} | data:Selection_9 | 11.5 KB | N/A |
| └─Selection_9 | 3333.33 | 6040186.91 | 587 | cop[tikv] | | tikv_task:{proc max:2.86s, min:0s, avg: 496.7ms, p80:50ms, p95:2.86s, iters:232, tasks:6}, scan_detail: {total_process_keys: 211785, total_process_keys_size: 14752565, total_keys: 293431, get_snapshot_time: 46.7µs, rocksdb: {delete_skipped_count: 192, key_skipped_count: 361746, block: {cache_hit_count: 2006}}}, time_detail: {total_process_time: 156ms, total_suspend_time: 2.83s, total_wait_time: 861.5µs, total_kv_read_wall_time: 2.98s, tikv_wall_time: 2.99s} | gt(mysql.stats_meta.version, 462255440511108949) | N/A | N/A |
| └─TableFullScan_8 | 10000.00 | 5541186.91 | 211785 | cop[tikv] | table:stats_meta | tikv_task:{proc max:2.86s, min:0s, avg: 496.7ms, p80:50ms, p95:2.86s, iters:232, tasks:6} | keep order:false, stats:pseudo | N/A | N/A |
4. What is your TiDB version? (Required)
master, 8.5.x
We should use an index range scan here. We could hardcode an index hint into the query in the code.
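As a sketch, assuming the `idx_ver(version)` index that `mysql.stats_meta` defines, the internal query could carry a TiDB `USE_INDEX` hint to force the range scan:

```sql
-- Sketch: force a range scan over idx_ver(version) instead of the
-- TableFullScan chosen under pseudo stats. idx_ver is the index on
-- mysql.stats_meta; USE_INDEX is a TiDB optimizer hint.
SELECT /*+ USE_INDEX(stats_meta, idx_ver) */
       version, table_id, modify_count, count, snapshot,
       last_stats_histograms_version
FROM mysql.stats_meta
WHERE version > 462255440511108949
ORDER BY version;
```

With the hint, the `version > ?` predicate becomes an index range, and the `ORDER BY version` may be satisfied by index order, avoiding the extra Sort operator as well.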
Another slow query from a different cluster:
| id | estRows | estCost | actRows | task | access object | execution info | operator info | memory | disk |
|---|---|---|---|---|---|---|---|---|---|
| Sort_5 | 3333.33 | 2467712.52 | 883 | root | | time:1.03s, loops:2 | mysql.stats_meta.version | 70.8 KB | 0 Bytes |
| └─TableReader_10 | 3333.33 | 487159.13 | 883 | root | | time:1.03s, loops:2, cop_task: {num: 2, max: 1.03s, min: 9.54ms, avg: 518.1ms, p95: 1.03s, max_proc_keys: 5974, p95_proc_keys: 5974, tot_proc: 1.03s, tot_wait: 43.4µs, copr_cache_hit_ratio: 0.00, build_task_duration: 25.2µs, max_distsql_concurrency: 2}, rpc_info:{Cop:{num_rpc:3, total_time:1.04s}, rpc_errors:{not_leader:1}} | data:Selection_9 | 33.6 KB | N/A |
| └─Selection_9 | 3333.33 | 6040186.91 | 883 | cop[tikv] | | tikv_task:{proc max:1.02s, min:0s, avg: 510ms, p80:1.02s, p95:1.02s, iters:16, tasks:2}, scan_detail: {total_process_keys: 7905, total_process_keys_size: 577992, total_keys: 44515, get_snapshot_time: 13µs, rocksdb: {delete_skipped_count: 14, key_skipped_count: 39785, block: {cache_hit_count: 906, read_count: 2419, read_byte: 17.1 MB, read_time: 878.4ms}}}, time_detail: {total_process_time: 1.03s, total_suspend_time: 713µs, total_wait_time: 43.4µs, total_kv_read_wall_time: 1.02s, tikv_wall_time: 1.03s} | gt(mysql.stats_meta.version, 462914649617858921) | N/A | N/A |
| └─TableFullScan_8 | 10000.00 | 5541186.91 | 7905 | cop[tikv] | table:stats_meta | tikv_task:{proc max:1.02s, min:0s, avg: 510ms, p80:1.02s, p95:1.02s, iters:16, tasks:2} | keep order:false, stats:pseudo | N/A | N/A |
We identified two patterns.

Pattern 1: `total_suspend_time` is too long
| id | estRows | estCost | actRows | task | access object | execution info | operator info | memory | disk |
|---|---|---|---|---|---|---|---|---|---|
| Sort_5 | 3333.33 | 2467712.52 | 159 | root | | time:1.35s, loops:2 | mysql.stats_meta.version | 11.6 KB | 0 Bytes |
| └─TableReader_10 | 3333.33 | 487159.13 | 159 | root | | time:1.35s, loops:2, cop_task: {num: 2, max: 1.35s, min: 29.7ms, avg: 687.4ms, p95: 1.35s, max_proc_keys: 5974, p95_proc_keys: 5974, tot_proc: 37ms, tot_wait: 100.5µs, copr_cache_hit_ratio: 0.00, build_task_duration: 13.5µs, max_distsql_concurrency: 2}, rpc_info:{Cop:{num_rpc:2, total_time:1.37s}} | data:Selection_9 | 6.56 KB | N/A |
| └─Selection_9 | 3333.33 | 6040186.91 | 159 | cop[tikv] | | tikv_task:{proc max:1.35s, min:30ms, avg: 690ms, p80:1.35s, p95:1.35s, iters:16, tasks:2}, scan_detail: {total_process_keys: 7811, total_process_keys_size: 570623, total_keys: 44040, get_snapshot_time: 16.5µs, rocksdb: {delete_skipped_count: 7, key_skipped_count: 39237, block: {cache_hit_count: 3315}}}, time_detail: {total_process_time: 37ms, total_suspend_time: 1.34s, total_wait_time: 100.5µs, total_kv_read_wall_time: 1.38s, tikv_wall_time: 1.37s} | gt(mysql.stats_meta.version, 462907117997916267) | N/A | N/A |
| └─TableFullScan_8 | 10000.00 | 5541186.91 | 7811 | cop[tikv] | table:stats_meta | tikv_task:{proc max:1.35s, min:30ms, avg: 690ms, p80:1.35s, p95:1.35s, iters:16, tasks:2} | keep order:false, stats:pseudo | N/A | N/A |
Based on the monitoring results, the large number of new SELECT queries exhausted disk IOPS, which slowed the SELECTs themselves; other query tasks running at the same time also competed with this system-table query and slowed it down further.
Pattern 2: block `read_time` is too long
| id | estRows | estCost | actRows | task | access object | execution info | operator info | memory | disk |
|---|---|---|---|---|---|---|---|---|---|
| Sort_5 | 3333.33 | 2467712.52 | 883 | root | | time:1.03s, loops:2 | mysql.stats_meta.version | 70.8 KB | 0 Bytes |
| └─TableReader_10 | 3333.33 | 487159.13 | 883 | root | | time:1.03s, loops:2, cop_task: {num: 2, max: 1.03s, min: 9.54ms, avg: 518.1ms, p95: 1.03s, max_proc_keys: 5974, p95_proc_keys: 5974, tot_proc: 1.03s, tot_wait: 43.4µs, copr_cache_hit_ratio: 0.00, build_task_duration: 25.2µs, max_distsql_concurrency: 2}, rpc_info:{Cop:{num_rpc:3, total_time:1.04s}, rpc_errors:{not_leader:1}} | data:Selection_9 | 33.6 KB | N/A |
| └─Selection_9 | 3333.33 | 6040186.91 | 883 | cop[tikv] | | tikv_task:{proc max:1.02s, min:0s, avg: 510ms, p80:1.02s, p95:1.02s, iters:16, tasks:2}, scan_detail: {total_process_keys: 7905, total_process_keys_size: 577992, total_keys: 44515, get_snapshot_time: 13µs, rocksdb: {delete_skipped_count: 14, key_skipped_count: 39785, block: {cache_hit_count: 906, read_count: 2419, read_byte: 17.1 MB, read_time: 878.4ms}}}, time_detail: {total_process_time: 1.03s, total_suspend_time: 713µs, total_wait_time: 43.4µs, total_kv_read_wall_time: 1.02s, tikv_wall_time: 1.03s} | gt(mysql.stats_meta.version, 462914649617858921) | N/A | N/A |
| └─TableFullScan_8 | 10000.00 | 5541186.91 | 7905 | cop[tikv] | table:stats_meta | tikv_task:{proc max:1.02s, min:0s, avg: 510ms, p80:1.02s, p95:1.02s, iters:16, tasks:2} | keep order:false, stats:pseudo | N/A | N/A |
For this one, `block: {cache_hit_count: 906, read_count: 2419, read_byte: 17.1 MB, read_time: 878.4ms}` shows that block reads dominate: scanning about 44k keys required loading over 2k blocks from disk (17.1 MB in 878.4ms). The gap between `total_keys: 44515` and `total_process_keys: 7905` indicates an excessive number of MVCC versions.