planner: avoid exceeding the configured concurrency limit (#61786)
This is an automated cherry-pick of #61786
What problem does this PR solve?
Issue Number: close #61785
Problem Summary:
Customers have observed higher I/O consumption when the analyze operation reaches an index than when it analyzes regular tables. (The analyze status contains sensitive information, so it is not included here.)
The root cause is nested concurrency in the analyze code path. When we perform the analyze operation, we create multiple concurrent tasks to execute it. However, each of these concurrently spawned goroutines creates additional concurrency of its own. As a result, the actual level of parallelism is significantly higher than we anticipated.
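To illustrate why the nesting matters, here is a minimal, self-contained sketch. It is not TiDB code: `peakParallelism`, `outer`, and `inner` are hypothetical names chosen for this example. Each outer worker starts its own set of inner workers, and a barrier holds every inner worker until all of them are running, so the measured peak equals the product of the two limits.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// peakParallelism is a hypothetical helper (not part of TiDB). It starts
// `outer` workers, each of which starts `inner` sub-workers, and returns the
// highest number of sub-workers that were running at the same time.
func peakParallelism(outer, inner int) int64 {
	var active, peak atomic.Int64
	var started, done sync.WaitGroup
	release := make(chan struct{})

	started.Add(outer * inner)
	done.Add(outer * inner)
	for i := 0; i < outer; i++ {
		go func() { // outer level: one goroutine per task
			for j := 0; j < inner; j++ {
				go func() { // inner level: each task fans out again
					defer done.Done()
					n := active.Add(1)
					// Record the highest value of `active` seen so far.
					for {
						p := peak.Load()
						if n <= p || peak.CompareAndSwap(p, n) {
							break
						}
					}
					started.Done()
					<-release // hold until every sub-worker has started
					active.Add(-1)
				}()
			}
		}()
	}
	started.Wait() // all outer*inner sub-workers are now running
	close(release)
	done.Wait()
	return peak.Load()
}

func main() {
	// Two outer workers with four inner workers each.
	fmt.Println(peakParallelism(2, 4)) // prints 8
}
```

With an outer limit of 2 and an inner limit of 4, eight goroutines run at once, even though neither setting on its own suggests more than four.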
CREATE TABLE `test` (
`c1` binary(16) NOT NULL,
`c2` tinyint(1) NOT NULL DEFAULT '0',
`c3` int NOT NULL,
`c4` varchar(48) COLLATE utf8mb4_general_ci NOT NULL,
`c5` varchar(512) COLLATE utf8mb4_general_ci DEFAULT NULL,
`c6` enum('A','B','C') COLLATE utf8mb4_general_ci DEFAULT NULL,
`c7` int unsigned NOT NULL DEFAULT '0',
`c8` int unsigned NOT NULL DEFAULT '0',
`c9` tinyint(1) GENERATED ALWAYS AS (`c7` > 0) VIRTUAL NOT NULL,
`c10` int DEFAULT NULL,
`c11` datetime(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
`c12` datetime(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
PRIMARY KEY (`c1`) /*T![clustered_index] CLUSTERED */,
KEY `idx_c4_c2_c9_c3_c12_c5_c6` (`c4`,`c2`,`c9`,`c3`,`c12`,`c5`,`c6`),
KEY `idx_c4_c2_c9_c12_c5_c6` (`c4`,`c2`,`c9`,`c12`,`c5`,`c6`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
analyze table test all columns;
show analyze status
+--------------+------------+----------------+-----------------------------------------------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+-------------------+----------+----------------------+
| Table_schema | Table_name | Partition_name | Job_info | Processed_rows | Start_time | End_time | State | Fail_reason | Instance | Process_ID | Remaining_seconds | Progress | Estimated_total_rows |
+--------------+------------+----------------+-----------------------------------------------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+-------------------+----------+----------------------+
| test | test | | analyze ndv for index idx_c4_c2_c9_c12_c5_c6 | 0 | 2025-06-18 14:48:05 | 2025-06-18 14:48:05 | finished | <null> | 127.0.0.1:4000 | <null> | <null> | <null> | <null> |
| test | test | | analyze ndv for index idx_c4_c2_c9_c3_c12_c5_c6 | 0 | 2025-06-18 14:48:05 | 2025-06-18 14:48:05 | finished | <null> | 127.0.0.1:4000 | <null> | <null> | <null> | <null> |
| test | test | | analyze table all indexes, columns c1, c2, c3, c4, c5, c6, c7, c9, c12 with 256 buckets, 100 topn, 1 samplerate | 0 | 2025-06-18 14:48:05 | 2025-06-18 14:48:05 | finished | <null> | 127.0.0.1:4000 | <null> | <null> | <null> | <null> |
+--------------+------------+----------------+-----------------------------------------------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+-------------------+----------+----------------------+
The output shows that two additional "analyze ndv for index" tasks are created, one for each index.
This is where the problem lies.
The first level of concurrency:
https://github.com/pingcap/tidb/blob/8fc1430b8340589d2967697a457c730caef1f9ba/pkg/executor/analyze.go#L121-L126
The second level of concurrency:
AnalyzeExec.analyzeWorker -> analyzeColumnsPushDownEntry -> analyzeColumnsPushDownV2
https://github.com/pingcap/tidb/blob/master/pkg/executor/analyze_col_v2.go#L105-L107
The third level of concurrency:
https://github.com/pingcap/tidb/blob/8fc1430b8340589d2967697a457c730caef1f9ba/pkg/executor/analyze_col_v2.go#L461-L466
This part is the most dangerous. It allows the concurrency of handleNDVForSpecialIndexes and the concurrency of column collection to run at the same time, which increases the risk to the business workload.
What changed and how does it work?
1. Wait until handleNDVForSpecialIndexes is completed before proceeding with the statistics collection for columns.
2. To prevent a change to the build stats concurrency from multiplying the actual number of concurrent tasks, we set the concurrency here to be the same as the build sampling concurrency.
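The two changes above can be pictured with the following sketch. It is an illustration under stated assumptions rather than the code in this PR: `tracker`, `runPhase`, and `analyzeSketch` are hypothetical names, and the tasks are no-ops. The special-index NDV phase runs to completion before the column phase starts, and both phases use the same worker limit, so the number of tasks running at once never exceeds that limit.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// tracker records how many tasks run at once (hypothetical helper type).
type tracker struct {
	active, peak, ran atomic.Int64
}

// run executes one task while updating the concurrency counters.
func (t *tracker) run(task func()) {
	n := t.active.Add(1)
	for {
		p := t.peak.Load()
		if n <= p || t.peak.CompareAndSwap(p, n) {
			break
		}
	}
	task()
	t.ran.Add(1)
	t.active.Add(-1)
}

// runPhase runs tasks on at most `limit` workers and returns only after
// every task in the phase has finished.
func runPhase(t *tracker, limit int, tasks []func()) {
	ch := make(chan func())
	var wg sync.WaitGroup
	for i := 0; i < limit; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range ch {
				t.run(task)
			}
		}()
	}
	for _, task := range tasks {
		ch <- task
	}
	close(ch)
	wg.Wait()
}

// analyzeSketch mimics the ordering described above: the special-index NDV
// phase first, then the column phase, both under the same limit.
func analyzeSketch(limit, ndvTasks, columnTasks int) (peak, ran int64) {
	t := &tracker{}
	makeTasks := func(n int) []func() {
		tasks := make([]func(), n)
		for i := range tasks {
			tasks[i] = func() {} // placeholder for real work
		}
		return tasks
	}
	runPhase(t, limit, makeTasks(ndvTasks))    // phase 1 completes first
	runPhase(t, limit, makeTasks(columnTasks)) // phase 2 starts afterwards
	return t.peak.Load(), t.ran.Load()
}

func main() {
	peak, ran := analyzeSketch(4, 10, 10)
	fmt.Println(ran, peak <= 4) // prints "20 true"
}
```

Because `runPhase` blocks until its workers exit, the two worker pools never overlap, which is the property the first change is meant to guarantee.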
Check List
Tests
- [x] Unit test
- [ ] Integration test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test
- [ ] I checked and no code files have been changed.
Side effects
- [ ] Performance regression: Consumes more CPU
- [ ] Performance regression: Consumes more Memory
- [ ] Breaking backward compatibility
Documentation
- [ ] Affects user behaviors
- [ ] Contains syntax changes
- [ ] Contains variable changes
- [ ] Contains experimental features
- [ ] Changes MySQL compatibility
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
None
@hawkingrei This PR has conflicts, so I have put it on hold.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.
/unhold
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: AilinKid, hawkingrei
- ~~OWNERS~~ [AilinKid,hawkingrei]
[LGTM Timeline notifier]
Timeline:
- 2025-07-04 23:33:38.094147216 +0000 UTC m=+1697070.817326196: :ballot_box_with_check: agreed by hawkingrei.
- 2025-07-08 09:34:30.982899874 +0000 UTC m=+1992323.706078856: :ballot_box_with_check: agreed by AilinKid.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Please upload report for BASE (release-7.5@469af9d). Learn more about missing BASE report.
Additional details and impacted files
@@ Coverage Diff @@
## release-7.5 #61813 +/- ##
================================================
Coverage ? 72.2023%
================================================
Files ? 1417
Lines ? 414294
Branches ? 0
================================================
Hits ? 299130
Misses ? 95172
Partials ? 19992
| Flag | Coverage Δ | |
|---|---|---|
| unit | 72.2023% <100.0000%> (?) | |
Flags with carried forward coverage won't be shown.
| Components | Coverage Δ | |
|---|---|---|
| dumpling | 52.9400% <0.0000%> (?) | |
| parser | ∅ <0.0000%> (?) | |
| br | 53.5323% <0.0000%> (?) | |
/retest