[Bug]: Load data daily regression on tke report 'stream closed'.

Open Ariznawlll opened this issue 1 year ago • 13 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

b72bea5f640f6d07fbabc5437c7d64f48d020be5

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/7462446702/job/20317926227

mo-log:http://175.178.192.213:30088/explore?panes=%7B%22LwR%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-regression-20240109%5C%22%7D%20%7C%3D%20%60stream%20closed%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-24h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1

Commit identified by bisection as introducing the problem: 594ad8b365da45fb27f755489f6bd01093ac1a7a (screenshot attached)

Expected Behavior

No response

Steps to Reproduce

daily regression on tke.

Additional information

No response

Ariznawlll avatar Jan 10 '24 07:01 Ariznawlll

I ran it two more times and the problem did not occur; the load workflow completed successfully both times. The Go version is 1.21.5 and the OS is Ubuntu 22.04.3.

https://github.com/matrixorigin/mo-nightly-regression/actions/runs/7473592676
https://github.com/matrixorigin/mo-nightly-regression/actions/runs/7474357006/job/20340694720

I then looked at the logs of the mo cluster that was started during this morning's binary search and found that one cn pod hit a segmentation fault at the time of the failure (screenshot attached).

Logs: http://175.178.192.213:30088/explore?panes=%7B%22LwR%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-regression-20240110%5C%22,%20pod%3D%5C%22nightly-regression-dis-tp-cn-2lfjb%5C%22%7D%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221704864342196%22,%22to%22:%221704864559196%22%7D%7D%7D&schemaVersion=1&orgId=1

@nnsgmsone could you take a look?

guguducken avatar Jan 10 '24 12:01 guguducken

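For a segmentation fault, core dumps need to be enabled on every daily run, so that when a segfault occurs we can inspect the core dump directly to diagnose it. @Ariznawlll @guguducken could you enable core dumps?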

nnsgmsone avatar Jan 12 '24 03:01 nnsgmsone

Waiting for a core dump from a reproduction.

nnsgmsone avatar Jan 17 '24 10:01 nnsgmsone

Waiting for a core dump from a reproduction.

nnsgmsone avatar Jan 22 '24 10:01 nnsgmsone

Waiting for a core dump from a reproduction.

nnsgmsone avatar Jan 25 '24 10:01 nnsgmsone

Waiting for a core dump from a reproduction.

nnsgmsone avatar Jan 30 '24 10:01 nnsgmsone

Waiting for a core dump from a reproduction.

nnsgmsone avatar Feb 02 '24 10:02 nnsgmsone

No progress.

nnsgmsone avatar Feb 21 '24 13:02 nnsgmsone

No progress.

nnsgmsone avatar Feb 26 '24 10:02 nnsgmsone

Currently working on data correctness issues.

nnsgmsone avatar Feb 29 '24 10:02 nnsgmsone

No progress.

nnsgmsone avatar Mar 05 '24 10:03 nnsgmsone

No progress.

nnsgmsone avatar Mar 08 '24 10:03 nnsgmsone

No progress.

nnsgmsone avatar Mar 13 '24 10:03 nnsgmsone

No progress.

nnsgmsone avatar Mar 18 '24 11:03 nnsgmsone

No progress.

nnsgmsone avatar Mar 21 '24 10:03 nnsgmsone

It hasn't been reproduced for a long time; closed.

sukki37 avatar Mar 26 '24 09:03 sukki37