
[Bug]: mo_cdc: mo reported panic error

Open heni02 opened this issue 1 year ago • 3 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

b4ea38bcd

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

panic: runtime error: index out of range [0] with length 0

goroutine 46148 gp=0xc00ea6c8c0 m=12 mp=0xc000a80008 [running]:
panic({0x49cb180?, 0xc02df55338?})
    /usr/local/go/src/runtime/panic.go:804 +0x168 fp=0xc000277570 sp=0xc0002774c0 pc=0x47b268
runtime.goPanicIndex(0x0, 0x0)
    /usr/local/go/src/runtime/panic.go:115 +0x74 fp=0xc0002775b0 sp=0xc000277570 pc=0x43fa74
github.com/matrixorigin/matrixone/pkg/cdc.(*mysqlSinker).getDeleteRowBuf(0xc017e45400, {0x55e5518, 0xc018e8e6c0})
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/sinker.go:467 +0x3d9 fp=0xc000277638 sp=0xc0002775b0 pc=0x37deb59
github.com/matrixorigin/matrixone/pkg/cdc.(*mysqlSinker).sinkDelete(0xc017e45400, {0x55e5518, 0xc018e8e6c0}, 0xc0002778c8?)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/sinker.go:381 +0x291 fp=0xc000277710 sp=0xc000277638 pc=0x37de111
github.com/matrixorigin/matrixone/pkg/cdc.(*mysqlSinker).sinkTail(0xc017e45400, {0x55e5518, 0xc018e8e6c0}, 0x2?, 0xc01da7d620)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/sinker.go:307 +0x5c5 fp=0xc000277a00 sp=0xc000277710 pc=0x37dd7c5
github.com/matrixorigin/matrixone/pkg/cdc.(*mysqlSinker).Sink(0xc017e45400, {0x55e5518, 0xc018e8e6c0}, 0xc01ba67100)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/sinker.go:232 +0x63c fp=0xc000277b30 sp=0xc000277a00 pc=0x37dcc1c
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTableWithTxn(0xc018e8c6c0, {0x55e5518, 0xc018e8e6c0}, {0x56ae0c8, 0xc021c5ea08}, 0xc018d0d3e0, 0xc0202f4570)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:299 +0x8c4 fp=0xc000277dc8 sp=0xc000277b30 pc=0x37db864
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTable(0xc018e8c6c0, {0x55e5518, 0xc018e8e6c0}, 0xc0202f4570)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:150 +0x226 fp=0xc000277eb8 sp=0xc000277dc8 pc=0x37dae26
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).Run(0xc018e8c6c0, {0x55e5518, 0xc018e8e6c0}, 0xc0202f4570)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:109 +0x173 fp=0xc000277fb0 sp=0xc000277eb8 pc=0x37da953
github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable.gowrap1()
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1417 +0x31 fp=0xc000277fe0 sp=0xc000277fb0 pc=0x3876311
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000277fe8 sp=0xc000277fe0 pc=0x484321
created by github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable in goroutine 46114
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1417 +0x548
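
For readers unfamiliar with this class of panic: the message "index out of range [0] with length 0" is what the Go runtime reports when element 0 of an empty (or nil) slice is read. The sketch below reproduces the message in isolation and shows a length check that avoids it. The function names are hypothetical and are not taken from MatrixOne's sinker.go; this only illustrates the failure mode, not the actual code at sinker.go:467.

```go
package main

import "fmt"

// firstByte returns the first element of buf, or ok=false when buf is empty.
// Reading buf[0] without the length check panics on an empty slice.
func firstByte(buf []byte) (b byte, ok bool) {
	if len(buf) == 0 {
		return 0, false
	}
	return buf[0], true
}

// indexUnchecked reads buf[0] with no guard and recovers the runtime
// panic so that its message can be inspected.
func indexUnchecked(buf []byte) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	_ = buf[0]
	return "no panic"
}

func main() {
	var empty []byte
	fmt.Println(indexUnchecked(empty))
	_, ok := firstByte(empty)
	fmt.Println(ok)
}
```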

panic error log: panic.log

mo log: http://10.222.6.1/explore?panes=%7B%22VCp%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-cdc-test%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221728977605518%22,%22to%22:%221728984805518%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

While syncing 15 million rows of data, the following operations were performed in order:
[root@mo-srv-128 mo-backup]# ./mo_cdc task create --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001" --sink-type="mysql" --sink-uri="mysql://dump:[email protected]:3306"    --tables='test_db.orders_15k:test_cdc_db.ORDERS_15k' --level="account"  --account="sys"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task pause --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task drop --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task create --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001" --sink-type="mysql" --sink-uri="mysql://dump:[email protected]:3306"    --tables='test_db.orders_15k:test_cdc_db.ORDERS_15k' --level="account"  --account="sys"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task pause --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task resume --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task restart --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
Error: invalid configuration: restart CDC task failed: Error 20101 (HY000): internal error: Task cdc_15k status can not be change, now it is Running
[root@mo-srv-128 mo-backup]# ./mo_cdc task pause --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task restart --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
Error: invalid configuration: restart CDC task failed: Error 20101 (HY000): internal error: Task cdc_15k status can not be change, now it is PauseRequested
[root@mo-srv-128 mo-backup]# ./mo_cdc task restart --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
OK
[root@mo-srv-128 mo-backup]# ./mo_cdc task show --task-name "cdc_15k" --source-uri="mysql://dump:[email protected]:6001"
[
  {
    "task-id": "01928f3f-f6b6-7586-bd74-1d899ca83356",
    "task-name": "cdc_15k",
    "source-uri": "mysql://dump:******@10.222.6.6:6001",
    "sink-uri": "mysql://dump:******@10.222.1.129:3306",
    "state": "running",
    "checkpoint": "{\n  \"test_db.orders_15k\": 2024-10-15 08:49:01.380634649 +0000 UTC,\n}",
    "timestamp": "2024-10-15 08:49:02.106482668 +0000 UTC"
  }
]
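
The two rejected restart attempts above suggest that the task state is checked before a restart is accepted. A minimal sketch of such a check follows, assuming only what the transcript shows: restart is refused while the task is Running or PauseRequested. The helper name, the "Paused" state, and the rule that every other state is restartable are assumptions made for illustration and are not taken from mo_cdc.

```go
package main

import "fmt"

// restartRejected lists the states in which the transcript shows a restart
// being refused. Treating all other states as restartable is an assumption.
var restartRejected = map[string]bool{
	"Running":        true,
	"PauseRequested": true,
}

// checkRestart is a hypothetical helper, not part of mo_cdc. It returns an
// error when the task is in a state that does not accept a restart.
func checkRestart(taskName, state string) error {
	if restartRejected[state] {
		return fmt.Errorf("task %s status cannot be changed, now it is %s", taskName, state)
	}
	return nil
}

func main() {
	// "Paused" is an assumed state name used only for this example.
	for _, s := range []string{"Running", "PauseRequested", "Paused"} {
		fmt.Println(s, checkRestart("cdc_15k", s))
	}
}
```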

Additional information

No response

heni02 avatar Oct 15 '24 09:10 heni02

This is caused by the CDC task reclaiming the sinker's resources before the sinker task had finished. A fix is in progress.
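
The general pattern for avoiding this kind of early release can be sketched as follows: the owner signals the worker to stop, waits for it to finish, and only then frees what the worker was using. This is a minimal illustration using sync.WaitGroup; the task type, its fields, and the integer buffer are hypothetical stand-ins and do not reflect the actual fix in MatrixOne.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a hypothetical owner of a worker goroutine and of the buffer
// that the worker reads from.
type task struct {
	wg   sync.WaitGroup
	work chan int
	buf  []int // resource used by the worker
	sum  int
}

// newTask starts the worker, which indexes into buf for each request.
func newTask() *task {
	t := &task{work: make(chan int), buf: []int{10, 20, 30}}
	t.wg.Add(1)
	go func() {
		defer t.wg.Done()
		for i := range t.work {
			// Would panic with an index error if buf were released early.
			t.sum += t.buf[i]
		}
	}()
	return t
}

// stop signals the worker, waits for it to drain, and only then
// releases the buffer.
func (t *task) stop() {
	close(t.work)
	t.wg.Wait()
	t.buf = nil
}

func main() {
	t := newTask()
	for i := 0; i < 3; i++ {
		t.work <- i
	}
	t.stop()
	fmt.Println(t.sum) // prints 60
}
```

Setting buf to nil before wg.Wait() returns would reintroduce the race: the worker could still be indexing into a slice that the owner has already cleared.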

ck89119 avatar Oct 15 '24 09:10 ck89119

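Still on the same commit: when resuming a full sync of 700 million rows, the CN panicked and restarted. The panic log is below; please confirm whether this is the same issue. panic log: panic2.log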

{"level":"INFO","time":"2024/10/15 10:59:48.937141 +0000","caller":"cdc/reader.go:95","msg":"cdc tableReader(test_db(272513).test01(272514) -> back_ac1_db.test01).Run: end"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x3233832]

goroutine 998531 gp=0xc018bb9180 m=3 mp=0xc0001d0e08 [running]:
panic({0x44d4780?, 0x8056910?})
    /usr/local/go/src/runtime/panic.go:804 +0x168 fp=0xc02a2e97a0 sp=0xc02a2e96f0 pc=0x47b268
runtime.panicmem(...)
    /usr/local/go/src/runtime/panic.go:262
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:900 +0x359 fp=0xc02a2e9800 sp=0xc02a2e97a0 pc=0x47d919
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.updateDataBatch(0x322eac0?, {0x1, 0x0, 0x0, 0x0, 0x29, 0x6d, 0x5b, 0x14, 0x46, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:878 +0x12 fp=0xc02a2e9858 sp=0xc02a2e9800 pc=0x3233832
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*AObjectHandle).getNextAObject(0xc029ea2780, {0x55e5518, 0xc05cc12e40})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:295 +0x1f7 fp=0xc02a2e9980 sp=0xc02a2e9858 pc=0x322da77
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*AObjectHandle).init(...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:266
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.(*baseHandle).init(0xc06c12cda0, {0x55e5518?, 0xc05cc12e40?}, 0x1, 0xc00764cfc0)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:405 +0x36 fp=0xc02a2e99a8 sp=0xc02a2e9980 pc=0x322ebb6
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay.NewChangesHandler(0xc0162bf7c0, {0x1, 0x0, 0x0, 0x0, 0x29, 0x6d, 0x5b, 0x14, 0x46, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/logtailreplay/change_handle.go:591 +0x293 fp=0xc02a2e9a38 sp=0xc02a2e99a8 pc=0x32303b3
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae.(*txnTable).CollectChanges(0xc029c62f00, {0x55e5518, 0xc05cc12e40}, {0x1, 0x0, 0x0, 0x0, 0x29, 0x6d, 0x5b, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/change_handle.go:40 +0x105 fp=0xc02a2e9aa0 sp=0xc02a2e9a38 pc=0x3303925
github.com/matrixorigin/matrixone/pkg/vm/engine/disttae.(*txnTableDelegate).CollectChanges(0x0?, {0x55e5518?, 0xc05cc12e40?}, {0x1, 0x0, 0x0, 0x0, 0x29, 0x6d, 0x5b, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/disttae/txn_table_sharding.go:164 +0x45 fp=0xc02a2e9ae8 sp=0xc02a2e9aa0 pc=0x335c945
github.com/matrixorigin/matrixone/pkg/cdc.init.func10({0x55e5518?, 0xc05cc12e40?}, {0x56a94a0?, 0xc058563440?}, {0x1, 0x0, 0x0, 0x0, 0x29, 0x6d, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/util.go:620 +0x5c fp=0xc02a2e9b30 sp=0xc02a2e9ae8 pc=0x37da51c
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTableWithTxn(0xc0369d42d0, {0x55e5518, 0xc05cc12e40}, {0x56ae0c8, 0xc029c6ca08}, 0xc0554e4c30, 0xc022b5e4b0)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:178 +0x31d fp=0xc02a2e9dc8 sp=0xc02a2e9b30 pc=0x37db2bd
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).readTable(0xc0369d42d0, {0x55e5518, 0xc05cc12e40}, 0xc022b5e4b0)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:150 +0x226 fp=0xc02a2e9eb8 sp=0xc02a2e9dc8 pc=0x37dae26
github.com/matrixorigin/matrixone/pkg/cdc.(*tableReader).Run(0xc0369d42d0, {0x55e5518, 0xc05cc12e40}, 0xc022b5e4b0)
    /go/src/github.com/matrixorigin/matrixone/pkg/cdc/reader.go:109 +0x173 fp=0xc02a2e9fb0 sp=0xc02a2e9eb8 pc=0x37da953
github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable.gowrap1()
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1417 +0x31 fp=0xc02a2e9fe0 sp=0xc02a2e9fb0 pc=0x3876311
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc02a2e9fe8 sp=0xc02a2e9fe0 pc=0x484321
created by github.com/matrixorigin/matrixone/pkg/frontend.(*CdcTask).addExecPipelineForTable in goroutine 926
    /go/src/github.com/matrixorigin/matrixone/pkg/frontend/cdc.go:1417 +0x548

heni02 avatar Oct 16 '24 02:10 heni02

This is the same problem as #19378; I will rerun the regression test once that issue is resolved.

heni02 avatar Oct 23 '24 06:10 heni02

Confirmed fixed; closing. Commit: 27fcb95

log:https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22t-5%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-cdc-test%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221729845018000%22,%22to%22:%221729854729000%22%7D%7D%7D&schemaVersion=1&orgId=1

heni02 avatar Oct 25 '24 16:10 heni02