matrixone
[Bug]: Stability test tpcc reported error: District for W_ID = 1 and D_ID = 4 not found
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch Name
main
Commit ID
d50211bca84238c88eae7028a67fe7fb14e859c1
Other Environment Information
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
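The stability test is a long-running mixed-workload test combining sysbench (100,000), tpcc (10 warehouses, 10), and tpch (100G). This error occurred during the tpcc run.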
mo log: http://10.222.6.1/explore?panes=%7B%2222b%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-stability-regression%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221703063205000%22,%22to%22:%221703063208000%22%7D%7D%7D&schemaVersion=1&orgId=1
mem use: http://10.222.6.1/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?orgId=1&from=1703041729328&to=1703074206027&var-datasource=prometheus&var-cluster=&var-namespace=mo-stability-regression
Expected Behavior
No response
Steps to Reproduce
Run the stability test.
Additional information
No response
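This looks like it was caused by out of memory.
The error is raised deliberately; it is not a real OOM.
It is caused by an mpool leak: the tracked alloc total keeps growing. Once it exceeds the global limit, every operation that needs to allocate memory returns this error, so SQL statements can no longer execute normally.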
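To make the mechanism above concrete, here is a minimal Go sketch of an accounting-only pool with a global cap. It is not the actual mpool code: `Pool`, `NewPool`, `Alloc`, `Free`, `leakyQuery`, and `ErrOutOfMemory` are all hypothetical names chosen for illustration, and the sizes are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrOutOfMemory is a hypothetical sentinel; the real mpool error may differ.
var ErrOutOfMemory = errors.New("out of memory")

// Pool is a toy pool that only tracks byte counts; it manages no real memory.
type Pool struct {
	limit int64 // global cap on tracked bytes
	alloc int64 // bytes currently counted as allocated
}

// NewPool creates a pool with the given global cap.
func NewPool(limit int64) *Pool { return &Pool{limit: limit} }

// Alloc rejects the request if the tracked total would exceed the cap.
func (p *Pool) Alloc(n int64) error {
	if p.alloc+n > p.limit {
		return ErrOutOfMemory
	}
	p.alloc += n
	return nil
}

// Free gives bytes back to the pool. A caller that skips this leaks accounting.
func (p *Pool) Free(n int64) { p.alloc -= n }

// leakyQuery simulates a code path that allocates 40 bytes and never frees them.
func leakyQuery(p *Pool) error { return p.Alloc(40) }

func main() {
	p := NewPool(100)
	for i := 1; i <= 3; i++ {
		err := leakyQuery(p)
		fmt.Printf("query %d: err=%v, tracked=%d\n", i, err, p.alloc)
	}
}
```

In this sketch the third call fails with the deliberate "out of memory" error even though the process itself is not out of memory: only the tracked counter has crossed the cap, and it stays there because nothing ever calls `Free`.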
I don't think this is a problem that can be solved easily.
Same as #13740; this issue can be closed. @heni02
update on 1.15:
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:212 - [UNEXPECTED][TT_PAYMENT][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:325 - [UNEXPECTED][TT_NEW_ORDER][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:212 - [UNEXPECTED][TT_PAYMENT][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:325 - [UNEXPECTED][TT_NEW_ORDER][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
2024-01-15 23:05:59 FATAL jTPCCTerminal:325 - [UNEXPECTED][TT_NEW_ORDER][EXECUTION] ErrorCode : 9999, ErrorMessage : Warehouse or Customer for W_ID = 1 and D_ID = 3 and C_ID= 387 not found.
2024-01-15 23:05:59 FATAL jTPCCTerminal:294 - [UNEXPECTED][DELIVERY_BG][EXECUTION] ErrorCode : 3015, ErrorMessage : error: out of memory
Previous DML conflicts with existing constraints or data format. This transaction has to be aborted
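This error is reported by the test tool. In essence, a query that is expected to always return a result may have returned an empty result instead.
The root cause is that mpool reports an OOM error, at which point the CN can no longer serve requests. The error is not handled, so the query keeps running and produces an empty result.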
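The failure mode described above can be sketched in Go as follows. This is an illustration only, not the actual CN query path: `scanDistrict`, `querySwallow`, `queryPropagate`, and `errOOM` are hypothetical names, and the `oom` flag stands in for an allocation failure.

```go
package main

import (
	"errors"
	"fmt"
)

// errOOM is a hypothetical stand-in for the error reported by the memory pool.
var errOOM = errors.New("out of memory")

// scanDistrict is a hypothetical row fetch that fails when allocation fails.
func scanDistrict(oom bool) ([]string, error) {
	if oom {
		return nil, errOOM
	}
	return []string{"district-4"}, nil
}

// querySwallow drops the error, so the caller sees zero rows and no failure.
func querySwallow(oom bool) []string {
	rows, _ := scanDistrict(oom)
	return rows
}

// queryPropagate surfaces the failure to the caller instead of an empty result.
func queryPropagate(oom bool) ([]string, error) {
	rows, err := scanDistrict(oom)
	if err != nil {
		return nil, fmt.Errorf("scan district: %w", err)
	}
	return rows, nil
}

func main() {
	fmt.Println("swallowed, rows returned:", len(querySwallow(true)))
	_, err := queryPropagate(true)
	fmt.Println("propagated, error:", err)
}
```

With the swallowed error, a client such as the tpcc tool only sees an empty result set and reports "not found". With propagation, it would see the out of memory error directly.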
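The analysis is the same as in the previous comment. Work on a fix has not started. This can only be fully resolved once the individual mpool OOM issues are fixed.
See #13740.
No capacity to work on this in the near term.
No time has been spent on this recently.
Same as the previous reply.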
This issue has been inactive. Closing to keep the tracker clean. Reopen if still relevant. Thanks!