[Bug]: mo_cdc: resume task caused cn oom

Open heni02 opened this issue 1 year ago • 3 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

72b1061

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

Resuming the task with 500 million+ rows of data (two columns) caused the CN to OOM. OOM time: 10-22 20:57:00. https://shanghai.idc.matrixorigin.cn:30001/d/cluster-detail-namespaced/cluster-detail-namespaced?orgId=1&var-namespace=mo-cdc-test&var-account=All&var-interval=$__auto_interval_interval&var-cluster=.%2A&var-loki=loki&from=1729594986042&to=1729649266720 Screenshot: 企业微信截图_a76491a6-7442-4b75-bc96-d3d00646efe9

Profiles taken at 10-22 20:56:20 and 10-22 20:56:50: hn_download.zip

Profiles taken at 10-22 20:45:54: CN_61396432-6439-6336-3065-363037363263_malloc_0192b443-2f30-7a97-9be7-f53e07d8c21e.gz CN_61396432-6439-6336-3065-363037363263_heap_0192b442-b79c-795d-98fa-f9160ea44dc3.gz

Grafana profile monitoring: https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22t-5%22:%7B%22datasource%22:%22pyroscope%22,%22queries%22:%5B%7B%22groupBy%22:%5B%5D,%22labelSelector%22:%22%7Bnamespace%3D%5C%22mo-cdc-test%5C%22,pod%3D%5C%22stability-regression-dis-tp-cn-9r7tg%5C%22%7D%22,%22queryType%22:%22both%22,%22refId%22:%22A%22,%22profileTypeId%22:%22memory:alloc_objects:count:space:bytes%22,%22datasource%22:%7B%22type%22:%22grafana-pyroscope-datasource%22,%22uid%22:%22pyroscope%22%7D%7D%5D,%22range%22:%7B%22from%22:%221729595441208%22,%22to%22:%221729649230080%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

1. Create the table on both the upstream and the downstream with this DDL: create table test01(a int auto_increment primary key,b int);
2. Insert data into the upstream: ./start.sh -h 10.222.6.6 -b test_db -c cases/ddl/
3. Create the data sync task: ./mo_cdc task create --task-name "cdc_resume" --source-uri="mysql://dump:[email protected]:6001" --sink-type="mysql" --sink-uri="mysql://dump:[email protected]:3306"    --tables='test_db.test01:back_ac1_db.test01' --level="account"  --account="sys"
4. Once data has started syncing to the downstream, wait a few minutes and pause the task: ./mo_cdc task pause --task-name "cdc_resume" --source-uri="mysql://dump:[email protected]:6001"
./mo_cdc task show --task-name "cdc_resume" --source-uri="mysql://dump:[email protected]:6001"
5. Once the upstream holds 500 million+ rows, resume the task: ./mo_cdc task resume --task-name "cdc_resume" --source-uri="mysql://dump:[email protected]:6001"

Additional information

No response

heni02, Oct 23 '24 03:10

This is the same problem as: https://github.com/matrixorigin/matrixone/issues/19488

reusee, Oct 23 '24 06:10

fixed

reusee, Oct 23 '24 07:10

To be verified tonight.

heni02, Oct 23 '24 09:10

#19378: verified in the same scenario, no OOM occurred. Commit: 27fcb95. Confirmed, closed.

heni02, Oct 25 '24 16:10