
[Bug]: Large difference in tpcc 100-1000 results between running alone and running in daily regression.

Open Ariznawlll opened this issue 1 year ago • 13 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

6ff08d6cff30aa00869325504631f2d5d6ef1ae0

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

tpcc 100 1000 as part of the daily regression workflow: job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/8709270996/job/23892311885

image

profile:
2024-04-16_19_13_43.zip

Standalone tpcc 100 1000 workflow: job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/8716115767/job/23909381912

image

profile: 2024-04-17_12_34_18.zip

Expected Behavior

No response

Steps to Reproduce

Trigger daily regression and binary search regression on TKE.

Additional information

No response

Ariznawlll avatar Apr 17 '24 04:04 Ariznawlll

Why is this a bug? @Ariznawlll

zhangxu19830126 avatar Apr 18 '24 05:04 zhangxu19830126

On the same resources, running a mixed workload will definitely perform differently from running a single workload, because the workloads pollute each other's memory and disk caches. This is not a bug.

zhangxu19830126 avatar Apr 18 '24 06:04 zhangxu19830126

In process

Ariznawlll avatar Apr 19 '24 11:04 Ariznawlll

In process

Ariznawlll avatar Apr 24 '24 10:04 Ariznawlll

In process

Ariznawlll avatar Apr 30 '24 07:04 Ariznawlll

In process

Ariznawlll avatar May 06 '24 03:05 Ariznawlll

0508 test results: commit id: 6e94513897f3a739602429e5a61d7c9127fd9a2a Running tpcc 500-1000 alone: job URL: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/8995415575/job/24710552822 image

Running tpcc 500-1000 mixed: job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/8951239289/job/24587949571 image

Ariznawlll avatar May 08 '24 04:05 Ariznawlll

Running alone: job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9026433240/job/24804803968 image

Running mixed: job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9019459807/job/24803217074 image

The profile is too large to attach; contact me directly for it.

Ariznawlll avatar May 10 '24 05:05 Ariznawlll

Profile of the standalone run: image

Profile of the mixed run: image

1. Read time in fs.s3fs is much higher. 2. GC time is much higher.

ouyuanning avatar May 14 '24 02:05 ouyuanning

乐声, could you take a look first? Please get the profile files from me; I can't upload them here.

ouyuanning avatar May 14 '24 02:05 ouyuanning

S3FS has more overhead because it needs to read more data.

Different workloads produce different IO and different cache conditions. I don't see anything here that deviates from what is expected.

reusee avatar May 14 '24 05:05 reusee

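Getting slower the longer it runs is a known issue. main and 1.2 already have related fixes, but I'm not sure whether they improve this scenario.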

@Ariznawlll Can this data be collected again on main or 1.2?

reusee avatar May 14 '24 06:05 reusee

working on other issues

reusee avatar May 17 '24 12:05 reusee

working on other issues

reusee avatar May 24 '24 12:05 reusee

working on other issues

reusee avatar May 29 '24 15:05 reusee

working on other issues

reusee avatar Jun 03 '24 10:06 reusee

@Ariznawlll Is this still the case?

reusee avatar Jun 06 '24 10:06 reusee

No progress

reusee avatar Jun 11 '24 14:06 reusee

working on other issues

reusee avatar Jun 15 '24 10:06 reusee

I suspect this behavior is also related to the goroutine leak.

reusee avatar Jun 20 '24 10:06 reusee

Once the related PRs are merged, we will check again whether similar behavior occurs.

reusee avatar Jun 25 '24 14:06 reusee

Same as above

reusee avatar Jun 29 '24 10:06 reusee

No progress

reusee avatar Jul 05 '24 14:07 reusee

working on other issues

reusee avatar Jul 10 '24 13:07 reusee

No progress

reusee avatar Jul 15 '24 10:07 reusee

No progress

reusee avatar Jul 18 '24 10:07 reusee

working on other issues.

reusee avatar Jul 23 '24 10:07 reusee

working on other issues.

reusee avatar Jul 26 '24 12:07 reusee

working on other issues.

reusee avatar Jul 31 '24 10:07 reusee

working on other issues.

reusee avatar Aug 06 '24 10:08 reusee