tidb icon indicating copy to clipboard operation
tidb copied to clipboard

The Plan of TPCH Q3 changes without any data update leading to 13s performance regression

Open Yui-Song opened this issue 3 years ago • 2 comments

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. deploy a tidb cluster: 1 tidb (16c) + 3 TiKV(16c) + 1 PD
  2. restore tpch 50g data
  3. run tpch for 30 mins

2. What did you expect to see? (Required)

The plans of all queries would not change and the plan for Q3 should be stuck to d14988835227e68de9bb1194760cee8e.

3. What did you see instead (Required)

The plan for Q3 would change in some of the daily runs. image

TPCH Q3

	q3 = `
/*PLACEHOLDER*/ select
	l_orderkey,
	sum(l_extendedprice * (1 - l_discount)) as revenue,
	o_orderdate,
	o_shippriority
from
	customer,
	orders,
	lineitem
where
	c_mktsegment = 'AUTOMOBILE'
	and c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and o_orderdate < '1995-03-13'
	and l_shipdate > '1995-03-13'
group by
	l_orderkey,
	o_orderdate,
	o_shippriority
order by
	revenue desc,
	o_orderdate
limit 10;
`
Olap_Detail_Log_ID: 2792562   Plan_Digest: d14988835227e68de9bb1194760cee8e Elapsed_Time (s): 26.5 

+--------------------------------------+-------------+----------+-----------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+
| ID                                   | ESTROWS     | ACTROWS  | TASK      | ACCESS OBJECT                                           | EXECUTION INFO                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | OPERATOR INFO                                                                                                                                                                                                                                   | MEMORY   | DISK    |
+--------------------------------------+-------------+----------+-----------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+
| Projection_14                        | 10.00       | 10       | root      |                                                         | time:26.5s, loops:2, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | test.lineitem.l_orderkey, Column#35, test.orders.o_orderdate, test.orders.o_shippriority                                                                                                                                                        | 2.52 KB  | N/A     |
| └─TopN_17                            | 10.00       | 10       | root      |                                                         | time:26.5s, loops:2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | Column#35:desc, test.orders.o_orderdate, offset:0, count:10                                                                                                                                                                                     | 76.8 KB  | N/A     |
|   └─HashAgg_22                       | 39991142.90 | 565763   | root      |                                                         | time:26.5s, loops:555, partial_worker:{wall_time:26.060572975s, concurrency:5, task_num:1461, tot_wait:2m8.697732643s, tot_exec:1.436736417s, tot_time:2m10.299361175s, max:26.060536947s, p95:26.060536947s}, final_worker:{wall_time:26.522530518s, concurrency:5, task_num:25, tot_wait:2m10.294821085s, tot_exec:2.216177038s, tot_time:2m12.511018685s, max:26.522479673s, p95:26.522479673s}                                                                                                                                                                                                                                                    | group by:Column#48, Column#49, Column#50, funcs:sum(Column#44)->Column#35, funcs:firstrow(Column#45)->test.orders.o_orderdate, funcs:firstrow(Column#46)->test.orders.o_shippriority, funcs:firstrow(Column#47)->test.lineitem.l_orderkey       | 378.4 MB | N/A     |
|     └─Projection_82                  | 92857210.61 | 1495049  | root      |                                                         | time:26s, loops:1462, Concurrency:5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#44, test.orders.o_orderdate, test.orders.o_shippriority, test.lineitem.l_orderkey, test.lineitem.l_orderkey, test.orders.o_orderdate, test.orders.o_shippriority | 1.09 MB  | N/A     |
|       └─IndexHashJoin_30             | 92857210.61 | 1495049  | root      |                                                         | time:26s, loops:1462, inner:{total:2m3.7s, concurrency:5, task:292, construct:8.58s, fetch:1m51.9s, build:1.94s, join:3.19s}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | inner join, inner:IndexLookUp_27, outer key:test.orders.o_orderkey, inner key:test.lineitem.l_orderkey, equal cond:eq(test.orders.o_orderkey, test.lineitem.l_orderkey)                                                                         | 35.5 MB  | N/A     |
|         ├─HashJoin_70(Build)         | 22875928.63 | 7274323  | root      |                                                         | time:6.91s, loops:7108, build_hash_table:{total:891.1ms, fetch:158.5ms, build:732.6ms}, probe:{concurrency:5, total:2m7.2s, max:25.4s, probe:1m53.5s, fetch:13.7s}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)]                                                                                                                                                                          | 141.3 MB | 0 Bytes |
|         │ ├─TableReader_76(Build)    | 1502320.19  | 1501166  | root      |                                                         | time:214.1ms, loops:1463, cop_task: {num: 150, max: 199.7ms, min: 1.34ms, avg: 51.1ms, p95: 173.9ms, max_proc_keys: 185477, p95_proc_keys: 183486, tot_proc: 7.46s, tot_wait: 24ms, rpc_num: 150, rpc_time: 7.66s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                                                                                                                                                                                                                                               | data:Selection_75                                                                                                                                                                                                                               | 9.18 MB  | N/A     |
|         │ │ └─Selection_75           | 1502320.19  | 1501166  | cop[tikv] |                                                         | tikv_task:{proc max:193ms, min:0s, avg: 48.8ms, p80:98ms, p95:168ms, iters:7937, tasks:150}, scan_detail: {total_process_keys: 7500000, total_process_keys_size: 1526085547, total_keys: 7500150, get_snapshot_time: 14.3ms, rocksdb: {key_skipped_count: 7500000, block: {cache_hit_count: 25190}}}                                                                                                                                                                                                                                                                                                                                                  | eq(test.customer.c_mktsegment, "AUTOMOBILE")                                                                                                                                                                                                    | N/A      | N/A     |
|         │ │   └─TableFullScan_74     | 7500000.00  | 7500000  | cop[tikv] | table:customer                                          | tikv_task:{proc max:169ms, min:0s, avg: 43.1ms, p80:87ms, p95:149ms, iters:7937, tasks:150}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | keep order:false                                                                                                                                                                                                                                | N/A      | N/A     |
|         │ └─TableReader_73(Probe)    | 36347384.33 | 36374625 | root      |                                                         | time:2.31s, loops:35402, cop_task: {num: 1615, max: 272.3ms, min: 1.22ms, avg: 63.1ms, p95: 173.8ms, max_proc_keys: 104416, p95_proc_keys: 104416, tot_proc: 1m37.8s, tot_wait: 597ms, rpc_num: 1615, rpc_time: 1m41.9s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                                                                                                                                                                                                                                         | data:Selection_72                                                                                                                                                                                                                               | 24.6 MB  | N/A     |
|         │   └─Selection_72           | 36347384.33 | 36374625 | cop[tikv] |                                                         | tikv_task:{proc max:234ms, min:0s, avg: 57.5ms, p80:117ms, p95:160ms, iters:79749, tasks:1615}, scan_detail: {total_process_keys: 75000000, total_process_keys_size: 11391895327, total_keys: 75001615, get_snapshot_time: 84.9ms, rocksdb: {key_skipped_count: 75000000, block: {cache_hit_count: 172276, read_count: 20178, read_byte: 337.6 MB, read_time: 146.8ms}}}                                                                                                                                                                                                                                                                              | lt(test.orders.o_orderdate, 1995-03-13 00:00:00.000000)                                                                                                                                                                                         | N/A      | N/A     |
|         │     └─TableFullScan_71     | 75000000.00 | 75000000 | cop[tikv] | table:orders                                            | tikv_task:{proc max:223ms, min:0s, avg: 54ms, p80:109ms, p95:150ms, iters:79749, tasks:1615}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | keep order:false                                                                                                                                                                                                                                | N/A      | N/A     |
|         └─IndexLookUp_27(Probe)      | 4.06        | 1495049  | root      |                                                         | time:1m45.4s, loops:1958, index_task: {total_time: 1m24.1s, fetch_handle: 1m24.1s, build: 4.73ms, wait: 13.1ms}, table_task: {total_time: 1m51.7s, num: 2583, concurrency: 5}                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                                                                                                                                                                                                                                                 | 155.7 KB | N/A     |
|           ├─IndexRangeScan_24(Build) | 7.50        | 29096047 | cop[tikv] | table:lineitem, index:PRIMARY(L_ORDERKEY, L_LINENUMBER) | time:1m21.7s, loops:30459, cop_task: {num: 12290, max: 156.9ms, min: 482.9µs, avg: 21ms, p95: 58.5ms, max_proc_keys: 26443, p95_proc_keys: 9184, tot_proc: 2m58s, tot_wait: 13.5s, rpc_num: 12290, rpc_time: 4m18.1s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, tikv_task:{proc max:135ms, min:0s, avg: 13.6ms, p80:22ms, p95:48ms, iters:74689, tasks:12290}, scan_detail: {total_process_keys: 29096047, total_process_keys_size: 1542090491, total_keys: 36380592, get_snapshot_time: 931ms, rocksdb: {key_skipped_count: 29096047, block: {cache_hit_count: 14364485, read_count: 226140, read_byte: 1023.0 MB, read_time: 1.03s}}}   | range: decided by [eq(test.lineitem.l_orderkey, test.orders.o_orderkey)], keep order:false                                                                                                                                                      | N/A      | N/A     |
|           └─Selection_26(Probe)      | 4.06        | 1495049  | cop[tikv] |                                                         | time:1m41.8s, loops:5703, cop_task: {num: 10316, max: 193.4ms, min: 379.4µs, avg: 20.8ms, p95: 63.1ms, max_proc_keys: 20984, p95_proc_keys: 9360, tot_proc: 2m59.9s, tot_wait: 13.6s, rpc_num: 10316, rpc_time: 3m33.9s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, tikv_task:{proc max:186ms, min:0s, avg: 16.5ms, p80:28ms, p95:56ms, iters:73336, tasks:10316}, scan_detail: {total_process_keys: 29096047, total_process_keys_size: 5779934528, total_keys: 34821877, get_snapshot_time: 1.43s, rocksdb: {key_skipped_count: 28246157, block: {cache_hit_count: 12914527, read_count: 59918, read_byte: 1.29 GB, read_time: 697.6ms}}} | gt(test.lineitem.l_shipdate, 1995-03-13 00:00:00.000000)                                                                                                                                                                                        | N/A      | N/A     |
|             └─TableRowIDScan_25      | 7.50        | 29096047 | cop[tikv] | table:lineitem                                          | tikv_task:{proc max:185ms, min:0s, avg: 16.2ms, p80:28ms, p95:55ms, iters:73336, tasks:10316}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | keep order:false                                                                                                                                                                                                                                | N/A      | N/A     |
+--------------------------------------+-------------+----------+-----------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+

Olap_Detail_Log_ID: 2792540   Plan_Digest: f8e52347ef089dc357e3ff1704d5415e Elapsed_Time (s): 39.0 

+-----------------------------------+--------------+-----------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+
| ID                                | ESTROWS      | ACTROWS   | TASK      | ACCESS OBJECT  | EXECUTION INFO                                                                                                                                                                                                                                                                                                                                                                                  | OPERATOR INFO                                                                                                                                                                                                                                   | MEMORY   | DISK    |
+-----------------------------------+--------------+-----------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+
| Projection_14                     | 10.00        | 10        | root      |                | time:39s, loops:2, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                              | test.lineitem.l_orderkey, Column#35, test.orders.o_orderdate, test.orders.o_shippriority                                                                                                                                                        | 2.52 KB  | N/A     |
| └─TopN_17                         | 10.00        | 10        | root      |                | time:39s, loops:2                                                                                                                                                                                                                                                                                                                                                                               | Column#35:desc, test.orders.o_orderdate, offset:0, count:10                                                                                                                                                                                     | 76.8 KB  | N/A     |
|   └─HashAgg_22                    | 39759090.21  | 565763    | root      |                | time:39s, loops:555, partial_worker:{wall_time:38.571025254s, concurrency:5, task_num:1461, tot_wait:3m11.057657942s, tot_exec:1.598260816s, tot_time:3m12.821384773s, max:38.570971871s, p95:38.570971871s}, final_worker:{wall_time:39.010329063s, concurrency:5, task_num:25, tot_wait:3m12.77921226s, tot_exec:2.193400817s, tot_time:3m14.97263345s, max:39.010246209s, p95:39.010246209s} | group by:Column#48, Column#49, Column#50, funcs:sum(Column#44)->Column#35, funcs:firstrow(Column#45)->test.orders.o_orderdate, funcs:firstrow(Column#46)->test.orders.o_shippriority, funcs:firstrow(Column#47)->test.lineitem.l_orderkey       | 378.4 MB | N/A     |
|     └─Projection_82               | 92857210.61  | 1495049   | root      |                | time:38.5s, loops:1462, Concurrency:5                                                                                                                                                                                                                                                                                                                                                           | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#44, test.orders.o_orderdate, test.orders.o_shippriority, test.lineitem.l_orderkey, test.lineitem.l_orderkey, test.orders.o_orderdate, test.orders.o_shippriority | 1.09 MB  | N/A     |
|       └─HashJoin_39               | 92857210.61  | 1495049   | root      |                | time:38.5s, loops:1462, build_hash_table:{total:9.95s, fetch:5.45s, build:4.49s}, probe:{concurrency:5, total:3m12.6s, max:38.5s, probe:54.8s, fetch:2m17.8s}                                                                                                                                                                                                                                   | inner join, equal:[eq(test.orders.o_orderkey, test.lineitem.l_orderkey)]                                                                                                                                                                        | 567.5 MB | 0 Bytes |
|         ├─HashJoin_70(Build)      | 22875928.63  | 7274323   | root      |                | time:7.69s, loops:7107, build_hash_table:{total:899.1ms, fetch:108ms, build:791.1ms}, probe:{concurrency:5, total:49.7s, max:9.94s, probe:27.5s, fetch:22.3s}                                                                                                                                                                                                                                   | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)]                                                                                                                                                                          | 141.3 MB | 0 Bytes |
|         │ ├─TableReader_76(Build) | 1502320.19   | 1501166   | root      |                | time:193ms, loops:1463, cop_task: {num: 150, max: 189ms, min: 805.7µs, avg: 51.9ms, p95: 173.7ms, max_proc_keys: 185477, p95_proc_keys: 183486, tot_proc: 7.57s, tot_wait: 26ms, rpc_num: 150, rpc_time: 7.78s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                            | data:Selection_75                                                                                                                                                                                                                               | 10.5 MB  | N/A     |
|         │ │ └─Selection_75        | 1502320.19   | 1501166   | cop[tikv] |                | tikv_task:{proc max:181ms, min:1ms, avg: 49.4ms, p80:117ms, p95:167ms, iters:7937, tasks:150}, scan_detail: {total_process_keys: 7500000, total_process_keys_size: 1526085547, total_keys: 7500150, get_snapshot_time: 15ms, rocksdb: {key_skipped_count: 7500000, block: {cache_hit_count: 25190}}}                                                                                            | eq(test.customer.c_mktsegment, "AUTOMOBILE")                                                                                                                                                                                                    | N/A      | N/A     |
|         │ │   └─TableFullScan_74  | 7500000.00   | 7500000   | cop[tikv] | table:customer | tikv_task:{proc max:159ms, min:0s, avg: 43.4ms, p80:104ms, p95:148ms, iters:7937, tasks:150}                                                                                                                                                                                                                                                                                                    | keep order:false                                                                                                                                                                                                                                | N/A      | N/A     |
|         │ └─TableReader_73(Probe) | 36347384.33  | 36374625  | root      |                | time:4s, loops:35403, cop_task: {num: 1615, max: 284.6ms, min: 1.41ms, avg: 69ms, p95: 165.6ms, max_proc_keys: 104416, p95_proc_keys: 104416, tot_proc: 1m47.7s, tot_wait: 168ms, rpc_num: 1615, rpc_time: 1m51.5s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                        | data:Selection_72                                                                                                                                                                                                                               | 13.9 MB  | N/A     |
|         │   └─Selection_72        | 36347384.33  | 36374625  | cop[tikv] |                | tikv_task:{proc max:216ms, min:0s, avg: 64.4ms, p80:132ms, p95:157ms, iters:79749, tasks:1615}, scan_detail: {total_process_keys: 75000000, total_process_keys_size: 11391895327, total_keys: 75001615, get_snapshot_time: 104.3ms, rocksdb: {key_skipped_count: 75000000, block: {cache_hit_count: 3121, read_count: 189333, read_byte: 3.09 GB, read_time: 1.28s}}}                           | lt(test.orders.o_orderdate, 1995-03-13 00:00:00.000000)                                                                                                                                                                                         | N/A      | N/A     |
|         │     └─TableFullScan_71  | 75000000.00  | 75000000  | cop[tikv] | table:orders   | tikv_task:{proc max:203ms, min:0s, avg: 61.5ms, p80:126ms, p95:150ms, iters:79749, tasks:1615}                                                                                                                                                                                                                                                                                                  | keep order:false                                                                                                                                                                                                                                | N/A      | N/A     |
|         └─TableReader_79(Probe)   | 161388779.98 | 161995407 | root      |                | time:19.9s, loops:157662, cop_task: {num: 7962, max: 359.2ms, min: 855.3µs, avg: 50.4ms, p95: 136.1ms, max_proc_keys: 95200, p95_proc_keys: 94176, tot_proc: 5m57.4s, tot_wait: 833ms, rpc_num: 7962, rpc_time: 6m40.9s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                   | data:Selection_78                                                                                                                                                                                                                               | 29.2 MB  | N/A     |
|           └─Selection_78          | 161388779.98 | 161995407 | cop[tikv] |                | tikv_task:{proc max:173ms, min:0s, avg: 40.4ms, p80:87ms, p95:112ms, iters:324826, tasks:7962}, scan_detail: {total_process_keys: 300005811, total_process_keys_size: 59595430182, total_keys: 300013773, get_snapshot_time: 482.3ms, rocksdb: {key_skipped_count: 300005811, block: {cache_hit_count: 803520, read_count: 185706, read_byte: 2.93 GB, read_time: 1.27s}}}                      | gt(test.lineitem.l_shipdate, 1995-03-13 00:00:00.000000)                                                                                                                                                                                        | N/A      | N/A     |
|             └─TableFullScan_77    | 300005811.00 | 300005811 | cop[tikv] | table:lineitem | tikv_task:{proc max:163ms, min:0s, avg: 38.2ms, p80:83ms, p95:106ms, iters:324826, tasks:7962}                                                                                                                                                                                                                                                                                                  | keep order:false                                                                                                                                                                                                                                | N/A      | N/A     |
+-----------------------------------+--------------+-----------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+

4. What is your TiDB version? (Required)

nightly

Yui-Song avatar Aug 30 '22 07:08 Yui-Song

q3-bad.zip q3-good.zip

The difference between good plan and bad plan is the Join and datasource for lineItem. For good plan, it's IndexHashJoin + IndexLookUp, and for bad plan it's HashJoin + TableReader.

Both cost in the plan replayer for good plan like following:

image image

Both cost in the plan replayer for bad plan like following: image

As you can see, the difference between the cost of indexHashJoin and HashJoin is quite small which caused the q3 plan unstable.

Yisaer avatar Sep 05 '22 05:09 Yisaer

Reason why the plan is unstable: 1) costs of these 2 plans (HJ and INLJ) are too close in cost-model-ver1 (as @Yisaer said), 2) the randomness of statistics is amplified in join estimation. image

The main reason to cause the optimizer cannot select the right plan INLJ stably in this case is that the join estimation is over-estimated a lot, which let the optimizer prefer to use HJ instead of INLJ. image

qw4990 avatar Sep 20 '22 04:09 qw4990

This issue can be solved by Cost Model Ver2.

qw4990 avatar Nov 17 '22 07:11 qw4990

Confirmed the issue was solved. Close the issue.

Yui-Song avatar Jun 05 '23 03:06 Yui-Song

/close

Yui-Song avatar Jun 05 '23 03:06 Yui-Song

@Yui-Song: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ti-chi-bot[bot] avatar Jun 05 '23 03:06 ti-chi-bot[bot]