other error: Coprocessor task terminated due to exceeding the deadline
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
- Create a partitioned table with 200 partitions.
- Insert about 250,000,000 rows.
- Execute the following SQL:
select count(1) from t
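The report does not include the table definition, so the following is only a sketch of how a comparable table could be set up. The column layout, the choice of HASH partitioning, and the helper name `build_partitioned_ddl` are all assumptions made for illustration; only the partition count (200) and the final `count(1)` query come from the steps above.

```python
def build_partitioned_ddl(table: str = "t", partitions: int = 200) -> str:
    """Return a CREATE TABLE statement for a hash-partitioned table.

    Assumption: the columns and the HASH partitioning scheme are made up
    for illustration; the report only says the table has 200 partitions.
    """
    return (
        f"CREATE TABLE {table} (\n"
        "  id BIGINT NOT NULL,\n"
        "  v VARCHAR(64),\n"
        "  PRIMARY KEY (id)\n"
        f") PARTITION BY HASH (id) PARTITIONS {partitions};"
    )


if __name__ == "__main__":
    # Print the statements to run against a test cluster.
    print(build_partitioned_ddl())
    print("SELECT COUNT(1) FROM t;")
```

Loading roughly 250,000,000 rows is left out here because the report does not say how the data was generated.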
2. What did you expect to see? (Required)
The query succeeds.
3. What did you see instead (Required)
[10:07:31]TiDB root:test> select count(1) from ios3;
(1105, 'other error: Coprocessor task terminated due to exceeding the deadline')
TiDB log:
[2023/10/24 02:08:32.299 +00:00] [WARN] [coprocessor.go:1413] ["other error"] [conn=553656796] [session_alias=] [txnStartTS=445150200404115457] [regionID=4690463] [bucketsVer=0] [latestBucketsVer=0] [rangeNums=160] [firstRangeStartKey="t\ufffd\u0000\u0000\u0000\u0000\u000f\u000e\ufffd_i\ufffd\u0000\u0000\u0000\u0000\u0000\u0000\u0002\u0000"] [lastRangeEndKey="t\ufffd\u0000\u0000\u0000\u0000\u000f\u000f\ufffd_i\ufffd\u0000\u0000\u0000\u0000\u0000\u0000\u0002\ufffd"] [storeAddr=tc-tikv-2.tc-tikv-peer.partition-analyze-test-tps-3180202-1-85.svc:20160] [error="other error: Coprocessor task terminated due to exceeding the deadline"]
[2023/10/24 02:08:32.300 +00:00] [INFO] [conn.go:1098] ["command dispatched failed"] [conn=553656796] [session_alias=] [connInfo="id:553656796, addr:10.200.24.43:63693 status:10, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:0, autocommit:1"] [sql="select count(1) from ios3"] [txn_mode=PESSIMISTIC] [timestamp=445150200404115457] [err="other error: Coprocessor task terminated due to exceeding the deadline
github.com/pingcap/tidb/pkg/store/copr.(*copIteratorWorker).handleCopResponse
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:1408
github.com/pingcap/tidb/pkg/store/copr.(*copIteratorWorker).handleCopPagingResult
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:1342
github.com/pingcap/tidb/pkg/store/copr.(*copIteratorWorker).handleTaskOnce
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:1275
github.com/pingcap/tidb/pkg/store/copr.(*copIteratorWorker).handleTask
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:1130
github.com/pingcap/tidb/pkg/store/copr.(*copIteratorWorker).run
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:817
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1650"]
TiKV log:
[2023/10/24 02:08:31.809 +00:00] [INFO] [endpoint.rs:575] ["the max gap of leader resolved-ts is large"] [last_resolve_attempt=None] [duration_to_last_update_safe_ts=10896ms] [min_memory_lock=None] [txn_num=0] [lock_num=0] [min_lock=None] [applied_index=13] [read_state="ReadState { idx: 11, ts: 445150205764960261 }"] [gap=39079ms] [region_id=4777745] [thread_id=0x5]
[2023/10/24 02:08:32.298 +00:00] [INFO] [tracker.rs:269] [slow-query] [perf_stats.internal_delete_skipped_count=0] [perf_stats.internal_key_skipped_count=136006302] [perf_stats.block_read_byte=2327560933] [perf_stats.block_read_count=240904] [perf_stats.block_cache_hit_count=13081] [scan.range.first="Some(start: 7480000000000F0EF65F69800000000000000200 end: 7480000000000F0EF65F698000000000000002FB)"] [scan.ranges=160] [scan.total=136006411] [scan.processed_size=7576750080] [scan.processed=132925440] [scan.is_desc=false] [tag=index] [table_id=986870] [txn_start_ts=445150200404115457] [total_suspend_time=227.053597ms] [total_process_time=59.772170009s] [handler_build_time=87.327µs] [wait_time.snapshot=23.879µs] [wait_time.schedule=21.859µs] [wait_time=45.738µs] [total_lifetime=59.999362611s] [remote_host=ipv4:10.200.9.66:45296] [region_id=4690463] [session_alias=] [connection_id=553656796] [thread_id=0x5]
[2023/10/24 02:08:32.298 +00:00] [WARN] [endpoint.rs:850] [error-response] [err="Coprocessor task terminated due to exceeding the deadline"] [thread_id=0x5]
[2023/10/24 02:08:32.298 +00:00] [WARN] [tracker.rs:449] ["query deadline exceeded"] [tag=index] [table_id=986870] [txn_start_ts=445150200404115457] [total_suspend_time=227.053597ms] [total_process_time=59.772170009s] [handler_build_time=87.327µs] [wait_time.snapshot=23.879µs] [wait_time.schedule=21.859µs] [wait_time=45.738µs] [total_lifetime=59.99943366s] [remote_host=ipv4:10.200.9.66:45296] [region_id=4690463] [session_alias=] [connection_id=553656796] [current_stage=Tracked] [thread_id=0x5]
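To make the timing in the TiKV log easier to read, here is a small sketch that extracts the duration fields from a log line and converts them to seconds. The helper names (`parse_duration`, `timing_fields`) are hypothetical and the sample line is abbreviated from the "query deadline exceeded" entry above. The numbers suggest that almost the entire lifetime of about 60 seconds was spent processing the scan rather than waiting; that the deadline is roughly 60 seconds is an inference from these values, not something the report states.

```python
import re

# Matches an unquoted "[key=value]" field; quoted fields are ignored here.
FIELD_RE = re.compile(r'\[([A-Za-z_.]+)=([^\[\]"]*)\]')

# Seconds per unit for the duration suffixes seen in the log.
# Both the micro sign (U+00B5) and Greek mu (U+03BC) are accepted.
UNIT_SECONDS = {
    "s": 1.0,
    "ms": 1e-3,
    "us": 1e-6,
    "\u00b5s": 1e-6,
    "\u03bcs": 1e-6,
    "ns": 1e-9,
}

DURATION_RE = re.compile(r"([0-9.]+)(ns|us|\u00b5s|\u03bcs|ms|s)")


def parse_duration(text: str) -> float:
    """Convert a value such as '227.053597ms' to seconds."""
    match = DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def timing_fields(line: str) -> dict:
    """Return the total_* and wait_time* fields of a log line, in seconds."""
    result = {}
    for key, value in FIELD_RE.findall(line):
        if key.startswith("total_") or key.startswith("wait_time"):
            try:
                result[key] = parse_duration(value)
            except ValueError:
                pass  # skip fields that are not durations
    return result


# Abbreviated copy of the "query deadline exceeded" line from the report.
SAMPLE = (
    '[WARN] [tracker.rs:449] ["query deadline exceeded"] [tag=index] '
    "[total_suspend_time=227.053597ms] [total_process_time=59.772170009s] "
    "[wait_time=45.738µs] [total_lifetime=59.99943366s]"
)

if __name__ == "__main__":
    for name, seconds in timing_fields(SAMPLE).items():
        print(f"{name}: {seconds:.6f}s")
```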
4. What is your TiDB version? (Required)
master
[10:31:48]TiDB root:test> select type,version,git_hash from information_schema.cluster_info;
+------+-------------+------------------------------------------+
| type | version     | git_hash                                 |
+------+-------------+------------------------------------------+
| tidb | 7.5.0-test  | db0d44cbda83be81128dcc9d02dc0b8d9108873a |
| pd   | 7.5.0-alpha | 4176c1daac69d00aec0512bd1e362b46fdebdd9a |
| tikv | 7.5.0-alpha | 9fb1ce63a079cd486f0fc4661ff28abb76d0e734 |
+------+-------------+------------------------------------------+
We should redirect this issue to TiKV: https://github.com/tikv/tikv/issues
@tiancaiamao This seems to be a TiDB issue: it does not appear to schedule tasks properly when there are a lot of partitions, e.g. it lacks rate limiting or task backoff.
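As a rough illustration of the two ideas mentioned here (rate limiting and backoff), the sketch below caps the number of tasks in flight and retries a task with exponential backoff when it reports a deadline error. This is not TiDB's coprocessor client code; `DeadlineExceeded`, `run_with_backoff`, and `run_all` are hypothetical names, and whether retrying would actually help with the scan in this report is not established.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class DeadlineExceeded(Exception):
    """Hypothetical stand-in for a 'deadline exceeded' response from a store."""


def run_with_backoff(task, max_retries=3, base_delay=0.01, sleep=time.sleep):
    """Call task(); on DeadlineExceeded, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except DeadlineExceeded:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            sleep(base_delay * (2 ** attempt))


def run_all(tasks, max_in_flight=4, **backoff_kwargs):
    """Run tasks with at most max_in_flight executing at once; keep input order."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [
            pool.submit(run_with_backoff, task, **backoff_kwargs) for task in tasks
        ]
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Ten trivial tasks, at most three running concurrently.
    print(run_all([lambda i=i: i * 2 for i in range(10)], max_in_flight=3))
```

The pool size acts as the rate limit, and the `sleep` parameter is injectable so the backoff schedule can be checked without real waiting.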
Change it to 'sig/execution' since this is a copr-related issue. /cc @zanmato1984
/remove-type bug
/type enhancement
/remove-severity major
/remove-label affects-8.1
/remove-label may-affects-5.3
@yibin87: The label(s) may-affects-5.3 cannot be applied. These labels are supported: fuzz/sqlancer, challenge-program, compatibility-breaker, first-time-contributor, contribution, good first issue, correctness, duplicate, proposal, security, ok-to-test, needs-ok-to-test, needs-more-info, needs-cherry-pick-release-5.4, needs-cherry-pick-release-6.1, needs-cherry-pick-release-6.5, needs-cherry-pick-release-7.1, needs-cherry-pick-release-7.5, needs-cherry-pick-release-8.1, affects-5.4, affects-6.1, affects-6.5, affects-7.1, affects-7.5, affects-8.1, may-affects-5.4, may-affects-6.1, may-affects-6.5, may-affects-7.1, may-affects-7.5, may-affects-8.1.
/remove-label may-affects-5.4
/remove-label may-affects-6.1
/remove-label may-affects-6.5
/remove-label may-affects-7.1
/remove-label may-affects-7.5