
[Bug]: Stability test tpcc reported error:District for W_ID = 1 and D_ID = 4 not found

Open heni02 opened this issue 1 year ago • 8 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

d50211bca84238c88eae7028a67fe7fb14e859c1

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

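The stability test is a long-running mixed-scenario test that combines sysbench (100,000), tpcc (10 warehouses, 10), and tpch (100G). This error occurred during the tpcc run. Screenshot: 企业微信截图_217a893b-139b-4143-bf87-030f3546e9a4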

mo log: http://10.222.6.1/explore?panes=%7B%2222b%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-stability-regression%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221703063205000%22,%22to%22:%221703063208000%22%7D%7D%7D&schemaVersion=1&orgId=1

mem use: http://10.222.6.1/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?orgId=1&from=1703041729328&to=1703074206027&var-datasource=prometheus&var-cluster=&var-namespace=mo-stability-regression

Expected Behavior

No response

Steps to Reproduce

Run the stability test.

Additional information

No response

heni02 avatar Dec 22 '23 11:12 heni02

This looks like it was caused by out of memory.

reusee avatar Dec 25 '23 02:12 reusee

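The error is raised deliberately; it is not a real OOM.

It is caused by an mpool leak: the tracked alloc total keeps growing. Once it exceeds the global limit, every operation that needs to allocate memory returns this error, so SQL statements can no longer execute normally.

I don't think this is a problem that can be solved easily.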
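A minimal sketch of the mechanism described above, assuming a pool that only does byte accounting against a global limit. This is not MatrixOne's mpool: `Pool`, `NewPool`, `Alloc`, `Free`, `Used`, and `ErrOutOfMemory` are hypothetical names chosen for illustration. It shows how a caller that never frees makes the tracked total grow until every later allocation is refused, even though the process itself is not out of memory.

```go
package main

import (
	"errors"
	"sync"
)

// ErrOutOfMemory is a hypothetical sentinel standing in for the
// deliberately raised "out of memory" error discussed in this thread.
var ErrOutOfMemory = errors.New("out of memory")

// Pool is a hypothetical accounting pool: it tracks how many bytes
// callers have reserved and refuses requests beyond a global limit.
type Pool struct {
	mu    sync.Mutex
	limit int64 // global cap on tracked bytes
	used  int64 // bytes currently tracked as allocated
}

// NewPool creates a pool with the given global limit in bytes.
func NewPool(limit int64) *Pool {
	return &Pool{limit: limit}
}

// Alloc reserves n bytes in the accounting. When the reservation would
// exceed the limit it returns ErrOutOfMemory without allocating anything.
func (p *Pool) Alloc(n int64) ([]byte, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.used+n > p.limit {
		return nil, ErrOutOfMemory
	}
	p.used += n
	return make([]byte, n), nil
}

// Free returns a buffer's bytes to the accounting. A code path that
// skips Free leaks accounting, so used only ever grows on that path.
func (p *Pool) Free(buf []byte) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.used -= int64(len(buf))
}

// Used reports the bytes currently tracked as allocated.
func (p *Pool) Used() int64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.used
}
```

With `NewPool(100)`, ten `Alloc(10)` calls that are never freed all succeed, after which even `Alloc(1)` returns `ErrOutOfMemory`. The same loop with a matching `Free` after each `Alloc` can run indefinitely.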

m-schen avatar Dec 25 '23 06:12 m-schen

Same as #13740; this issue can be closed. @heni02

m-schen avatar Dec 25 '23 06:12 m-schen

update on 1.15:

Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:212 - [UNEXPECTED][TT_PAYMENT][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:325 - [UNEXPECTED][TT_NEW_ORDER][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:212 - [UNEXPECTED][TT_PAYMENT][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:325 - [UNEXPECTED][TT_NEW_ORDER][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:325 - [UNEXPECTED][TT_NEW_ORDER][EXECUTION] ErrorCode : 9999, ErrorMessage : Warehouse or Customer for W_ID = 1 and D_ID = 3 and C_ID= 387 not found.
2024-01-15 23:05:59 FATAL jTPCCTerminal:294 - [UNEXPECTED][DELIVERY_BG][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted

aressu1985 avatar Jan 17 '24 04:01 aressu1985

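This error is reported by the test tool. In essence, a query that is expected to always return a result apparently returned an empty result instead.

The root cause is that mpool emitted an OOM error, at which point the CN could no longer serve requests. The error was not handled, so the query kept running and produced an empty result.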
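The failure mode described above can be sketched as follows. This is an illustration under stated assumptions, not MatrixOne or benchmark code: `Row`, `scan`, `lookupSwallowing`, `lookupPropagating`, and `clientMessage` are hypothetical, and the only string taken from this issue is the "not found" message in its title. The point is that a dropped error makes a failed scan indistinguishable from a scan that matched zero rows.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrOutOfMemory is a hypothetical sentinel for the pool's refusal.
var ErrOutOfMemory = errors.New("out of memory")

// Row is a hypothetical stand-in for a District row.
type Row struct{ WID, DID int }

// scan is a hypothetical executor step. When oom is true it fails the
// way an allocation refused by the memory pool would.
func scan(oom bool) ([]Row, error) {
	if oom {
		return nil, ErrOutOfMemory
	}
	return []Row{{WID: 1, DID: 4}}, nil
}

// lookupSwallowing drops scan's error, so a failed scan looks exactly
// like a successful scan that matched no rows.
func lookupSwallowing(oom bool) []Row {
	rows, _ := scan(oom) // error ignored: this is the bug being illustrated
	return rows
}

// lookupPropagating wraps and returns the failure instead of hiding it.
func lookupPropagating(oom bool) ([]Row, error) {
	rows, err := scan(oom)
	if err != nil {
		return nil, fmt.Errorf("district lookup: %w", err)
	}
	return rows, nil
}

// clientMessage mimics, as an assumption, how a benchmark client might
// report the outcome of a lookup that must return exactly one row.
func clientMessage(rows []Row, err error) string {
	switch {
	case err != nil:
		return "error: " + err.Error()
	case len(rows) == 0:
		return "District for W_ID = 1 and D_ID = 4 not found"
	default:
		return "ok"
	}
}
```

In this sketch, `clientMessage(lookupSwallowing(true), nil)` yields the misleading "not found" message, while the propagating variant reports the underlying out-of-memory error and lets the caller abort the transaction.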

m-schen avatar Jan 24 '24 10:01 m-schen

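Same analysis as the previous comment. Work on a fix has not started. This can only be fully resolved once the various mpool OOM problems are fixed.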

m-schen avatar Feb 01 '24 11:02 m-schen

See 13740

m-schen avatar May 17 '24 11:05 m-schen

Unable to put time into this in the near term.

m-schen avatar Jul 03 '24 10:07 m-schen

No time has been put into this recently.

m-schen avatar Jul 08 '24 11:07 m-schen

Same as the previous reply.

m-schen avatar Jul 11 '24 11:07 m-schen

This issue has been inactive. Closing to keep the tracker clean. Reopen if still relevant. Thanks!

sukki37 avatar Jul 16 '24 07:07 sukki37