v6.2.0 [PANIC] [sorter.go:306] ["resolved ts regression"]

Open Tammyxia opened this issue 2 years ago • 2 comments

What did you do?

  • Test case: cc_tikv_scale_sync

  • Workload: 60k tables

  • Steps: scale out TiKV to 7; scale in CDC from 6 to 4; scale in TiKV from 7 to 4

What did you expect to see?

No response

What did you see instead?

Test case log:
[2022/08/02 06:15:46.743 +00:00] [INFO] [step.go:30] ["scaleTiDBCluster, target=tikv, replica=7, timeout=0s"]
[2022/08/02 06:29:36.901 +00:00] [INFO] [tidb_cluster.go:103] ["scale tidb cluster success"] [option="scaleTiDBCluster, target=tikv, replica=7, timeout=0s"] [elapsed=13m50.158236255 s]
[2022/08/02 06:29:36.901 +00:00] [INFO] [step.go:30] ["scaleTiDBCluster, target=ticdc, replica=4, timeout=0s"]
[2022/08/02 06:32:12.041 +00:00] [INFO] [tidb_cluster.go:103] ["scale tidb cluster success"] [option="scaleTiDBCluster, target=ticdc, replica=4, timeout=0s"] [elapsed=2m35.139213438 s]
[2022/08/02 06:32:12.041 +00:00] [INFO] [step.go:30] ["scaleTiDBCluster, target=tikv, replica=4, timeout=0s"]

upstream-ticdc-2 panic log:
[2022/08/02 06:31:40.185 +00:00] [PANIC] [sorter.go:306] ["resolved ts regression"] [tableID=61568] [resolvedTs=435007491944480771] [oldResolvedTs=435007493255200775] [stack="github.com/pingcap/tiflow/cdc/processor/pipeline.(*sorterNode).handleRawEvent\n\tgithub.com/pingc

upstream-ticdc-1 panic log:
[2022/08/02 07:29:41.982 +00:00] [PANIC] [sorter.go:306] ["resolved ts regression"] [tableID=61501] [resolvedTs=435008405739143171] [oldResolvedTs=435008406263431170] [stack="github.com/pingcap/tiflow/cdc/processor/pipeline.(*sorterNode).handleRawEvent\n\tgithub.com/pingc
[2022/08/02 07:29:44.078 +00:00] [INFO] [helper.go:49] ["init log"] [file=/var/lib/ticdc/log/ticdc.log] [level=info]
[2022/08/02 07:29:44.078 +00:00] [INFO] [version.go:47] ["Welcome to Change Data Capture (CDC)"] [release-version=v6.2.0] [git-hash=bd21a6ea5ce58b86fe139eddca8dc4436320ca60] [git-branch=heads/refs/tags/v6.2.0] [utc-build-time="2022-08-01 09:09:06"] [go-version="go version
[2022/08/02 07:29:44.078 +00:00] [INFO] [server.go:90] ["creating CDC server"]
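
To see how far the resolved ts went backwards in the two panic logs, the timestamps can be decoded. The sketch below assumes the usual TiDB TSO layout (physical Unix milliseconds in the high bits, an 18-bit logical counter in the low bits). The helper names splitTSO and regressionMs are made up for this illustration and are not part of tiflow.

```go
package main

import "fmt"

// logicalBits is the assumed width of the logical counter in a TSO.
const logicalBits = 18

// splitTSO separates a TSO into its physical part (Unix milliseconds)
// and its logical counter. Hypothetical helper, not a tiflow function.
func splitTSO(ts uint64) (physicalMs int64, logical int64) {
	return int64(ts >> logicalBits), int64(ts & (1<<logicalBits - 1))
}

// regressionMs returns how many physical milliseconds resolvedTs
// lies behind oldResolvedTs.
func regressionMs(oldResolvedTs, resolvedTs uint64) int64 {
	oldPhys, _ := splitTSO(oldResolvedTs)
	newPhys, _ := splitTSO(resolvedTs)
	return oldPhys - newPhys
}

func main() {
	// Values copied from the two panic logs in this issue.
	fmt.Println("table 61568 regressed by (ms):",
		regressionMs(435007493255200775, 435007491944480771))
	fmt.Println("table 61501 regressed by (ms):",
		regressionMs(435008406263431170, 435008405739143171))
}
```

Under that assumption, the regressions in the two logs come out to roughly 5 s and 2 s of physical time respectively.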

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(paste TiDB cluster version here)

Upstream TiKV version (execute tikv-server --version):

(paste TiKV version here)

TiCDC version (execute cdc version):

(paste TiCDC version here)
Release Version: v6.2.0
Git Commit Hash: bd21a6ea5ce58b86fe139eddca8dc4436320ca60
Git Branch: heads/refs/tags/v6.2.0
UTC Build Time: 2022-08-01 09:09:06
Go Version: go version go1.18.2 linux/amd64
Failpoint Build: false

Tammyxia, Aug 02 '22 07:08

/label affects-6.2

jebter, Aug 02 '22 10:08

If the batch resolved ts feature is enabled, a Region's resolved timestamp can regress. This is expected, and Frontier allows it, so I think simply removing the panic is good enough.

hicqu, Aug 03 '22 08:08
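
A minimal sketch of what tolerating the regression instead of panicking could look like. This is not the actual tiflow sorter code: the resolvedTsTracker type and its advance method are hypothetical, and the only idea taken from the comment above is that a resolved ts lower than the current one is skipped rather than treated as fatal.

```go
package main

import "fmt"

// resolvedTsTracker is a hypothetical stand-in for the per-table
// state a sorter node might keep. It is not a tiflow type.
type resolvedTsTracker struct {
	tableID    int64
	resolvedTs uint64
}

// advance applies a newly received resolved ts and reports whether the
// watermark moved forward. A value that is not greater than the current
// one is ignored instead of triggering a panic.
func (t *resolvedTsTracker) advance(ts uint64) bool {
	if ts <= t.resolvedTs {
		return false
	}
	t.resolvedTs = ts
	return true
}

func main() {
	t := &resolvedTsTracker{tableID: 61568}
	// The second value is the regressed resolvedTs from the first panic log.
	for _, ts := range []uint64{435007493255200775, 435007491944480771} {
		if !t.advance(ts) {
			fmt.Printf("table %d: ignoring stale resolved ts %d (current %d)\n",
				t.tableID, ts, t.resolvedTs)
		}
	}
}
```

Keeping the larger value means downstream consumers still see a monotonically non-decreasing watermark even when the input regresses.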

/assign @sdojjy

nongfushanquan, Sep 28 '22 02:09

Since this issue does not cause a data consistency problem and is hard to reproduce, I'm going to adjust it to severity/moderate.

asddongmen, Sep 28 '22 02:09

Reopen this issue if it happens again.

sdojjy, Sep 30 '22 04:09