matrixone [Bug]: sysbench/tpcc perf continuously decreases during OLTP high-concurrency test.

Is there an existing issue for the same bug?

[X] I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93): 27c0e5e2ada81d945d858e2f57c04c7568268afb
- Hardware parameters:
- OS type:
- Others:

Actual Behavior

1. Insert/update performance continuously decreases during the OLTP high-concurrency test.

2. point_select performance differs greatly at different times on the same MO instance, e.g. first: second:

Between the two runs, some insert and update tests had been executed.

Please assign this issue to @XuPeng-SH

Expected Behavior

No response

Steps to Reproduce

No response

Additional information

No response

May 16 '23 03:05 aressu1985

@aptend storage optimization with better merge policy

May 16 '23 07:05 XuPeng-SH

Progress:

env: AMD EPYC 7K83 64-Core Processor 2.5GHz x 16 + 64G

baseline: 6c0e0c5ee89b16b36f547aee6aaa8f6e2341432c
merge-1: https://github.com/aptend/matrixone/commit/6c78e28a5a61ff396d4cab3f989b9166130c4397
merge-2: merge-1 + merge compacted blocks in nonappendable segments.

process(vuser=100, 10min)	baseline	merge-1	merge-2
select-10-100000-prepare	13116 -> 12066	13760 -> 12340	14680 -> 12422
update-10-100000-prepare	523 -> 1053 -> 729	680 -> 1128 -> 789	647 -> 1093 -> 819
insert-10-100000-prepare	5213 -> 3968	5200 -> 4100	6242 -> 4171
select-10-100000-prepare	1789 -> 4188	5929 -> 7249	5628 -> 7320 ⭐️
update-10-100000-prepare	342 -> 720 -> 555	454 -> 825 -> 624	421 -> 763 -> 667
select-10-100000-prepare	2000 -> 3169	2856 -> 4017	3044 -> 5391 ⭐️
idle(15min)
select-10-100000-prepare	3956 - 4000	4399 - 4458	5888 - 5988 ⭐️

May 23 '23 08:05 aptend

env: AMD EPYC 7K83 64-Core Processor 2.5GHz x 16 + 64G

baseline: 6a65b65cecb30816d3565b2551308ab79b6c27cc
merge-1: edee1e1c2dd75841858034eb5a8262126ac5ae68

process(vuser=100,10min)	baseline	merge-1
select-10-1000000-prepare	8000 -> 8690	10129 -> 8700
update-10-1000000-prepare	300 -> 858 -> 617	451 -> 929 -> 654
insert-10-1000000-prepare	4000 -> 1718	4777 -> 3557 ⭐️
select-10-1000000-prepare	1200 -> 5380 ⭐️	2222 -> 4568
update-10-1000000-prepare	300 -> 600 -> 486	371 -> 548 -> 461
select-10-1000000-prepare	1600 -> 4721 ⭐️	1815 -> 3353
Note:	30 merges	430 merges

Only insert benefits from constant merging, and the cost of merging is a concern...

May 24 '23 08:05 aptend

For phenomenon 2 (point_select performance differs greatly at different times on the same MO instance):

One possible cause is that updates create many more blocks, so iterating through all blocks takes longer. I will trace the getBlockInfos method.

Jun 01 '23 10:06 aptend

After one round of updates, the number of blocks grows to 1000+, which increases the BlockIter cost from 100us to 1ms.
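
To illustrate why the block count matters for a point select, here is a minimal Go sketch. It assumes that every block has to be visited and checked against its key range; the `blockInfo` type and the helper functions are hypothetical and are not MatrixOne's actual BlockIter or getBlockInfos code.

```go
package main

import "fmt"

// blockInfo is a hypothetical stand-in for per-block metadata;
// it is not MatrixOne's real type.
type blockInfo struct {
	minKey, maxKey int64 // key range covered by the block
}

// candidateBlocks walks every block and keeps those whose range may contain key.
// The loop visits all blocks, so its cost is linear in len(blocks).
func candidateBlocks(blocks []blockInfo, key int64) (hits []int, visited int) {
	for i, b := range blocks {
		visited++
		if key >= b.minKey && key <= b.maxKey {
			hits = append(hits, i)
		}
	}
	return hits, visited
}

// makeBlocks builds n blocks with disjoint, contiguous key ranges of the given width.
func makeBlocks(n int, width int64) []blockInfo {
	blocks := make([]blockInfo, n)
	for i := range blocks {
		lo := int64(i) * width
		blocks[i] = blockInfo{minKey: lo, maxKey: lo + width - 1}
	}
	return blocks
}

func main() {
	// Ten times more blocks means ten times more visits for the same single hit.
	for _, n := range []int{100, 1000} {
		hits, visited := candidateBlocks(makeBlocks(n, 8192), 4242)
		fmt.Printf("blocks=%d visited=%d hits=%d\n", n, visited, len(hits))
	}
}
```

In this sketch the number of matching blocks stays at one while the number of visited blocks grows with the table, which is the same shape as the 100us to 1ms increase reported above.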

Jun 06 '23 10:06 aptend

I've given thee courtesy enough -- Hoarah Loux

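Setup:

Removed the logservice write step
Concurrency: 20
11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz x 16 + 16 G
In the figure below, the left side is 2800 tps and the right side is 4000 tps.

exec.Run takes a small share of the time and is basically unchanged.

Even with the log flush removed, Commit still shows long-tail fluctuations, and the long tail is more severe in both Commit and Compile.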

Jun 12 '23 07:06 aptend

Working on stability issues.

Jun 28 '23 10:06 aptend

Adding end-to-end latency tracing for the whole commit path.

Jul 03 '23 10:07 aptend

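With the logservice write removed, the main impact seen so far comes from the RPC layer not cancelling the context carried in each message in time. As a result, too many goroutines are alive at the moment the timeout fires, and other goroutines are not scheduled promptly. A fix is being implemented and tested.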
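
The effect of not cancelling a per-message context can be sketched in Go as follows. This is only an illustration under stated assumptions: the `message` type, `send`, and `drain` are hypothetical and do not reflect MatrixOne's actual RPC code. Each message owns a timeout context and a watcher goroutine that exits only when that context is done.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// message is a hypothetical RPC message that carries its own context.
type message struct {
	ctx    context.Context
	cancel context.CancelFunc
}

// newMessage attaches a timeout context to a new message.
func newMessage(parent context.Context, timeout time.Duration) *message {
	ctx, cancel := context.WithTimeout(parent, timeout)
	return &message{ctx: ctx, cancel: cancel}
}

// send starts a watcher goroutine that lives until the message context is done.
// The reply is assumed to arrive immediately; with cancelOnReply the context is
// cancelled right away, otherwise the watcher stays alive until the timeout.
func send(m *message, wg *sync.WaitGroup, cancelOnReply bool) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		<-m.ctx.Done() // blocks until cancel is called or the timeout fires
	}()
	if cancelOnReply {
		m.cancel()
	}
}

// drain sends n messages and reports how long it takes for all watchers to exit.
func drain(n int, timeout time.Duration, cancelOnReply bool) time.Duration {
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		send(newMessage(context.Background(), timeout), &wg, cancelOnReply)
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	const timeout = 200 * time.Millisecond
	fmt.Println("no cancel, watchers exit after:  ", drain(1000, timeout, false))
	fmt.Println("with cancel, watchers exit after:", drain(1000, timeout, true))
}
```

Without the cancel call, all watcher goroutines stay parked until the timeout and then wake up together; with it, each one exits as soon as its message is finished.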

Jul 07 '23 10:07 aptend

The insert performance degradation has now improved considerably.

Jul 14 '23 10:07 aptend

Debugging pessimistic transactions and optimizing flush and merge.

Jul 19 '23 13:07 aptend

Working on the related issue https://github.com/matrixorigin/matrixone/issues/10527

Aug 18 '23 10:08 aptend

Adding metrics.

Nov 13 '23 10:11 aptend

Refactoring the merge-related code.

Nov 16 '23 10:11 aptend

One of the cases is #12775: sysbench delete is slow.

Nov 21 '23 10:11 aptend

Standalone test: sysbench with 1,000,000 rows, 100 concurrent threads, 15 min. The performance degradation no longer appears.

[ 10s ] thds: 100 tps: 4508.63 qps: 4508.63 ... 
[ 20s ] thds: 100 tps: 4787.31 qps: 4787.31 ... 
[ 30s ] thds: 100 tps: 4700.33 qps: 4700.33 ... 
[ 40s ] thds: 100 tps: 4497.15 qps: 4497.15 ... 
[ 50s ] thds: 100 tps: 4280.30 qps: 4280.30 ... 
[ 60s ] thds: 100 tps: 4200.12 qps: 4200.12 ... 
[ 70s ] thds: 100 tps: 4080.87 qps: 4080.87 ... 
[ 80s ] thds: 100 tps: 4379.09 qps: 4379.09 ... 
[ 90s ] thds: 100 tps: 4547.70 qps: 4547.70 ... 
[ 100s ] thds: 100 tps: 4414.33 qps: 4414.33 ... 
[ 110s ] thds: 100 tps: 4376.33 qps: 4376.33 ... 
[ 120s ] thds: 100 tps: 4315.47 qps: 4315.47 ... 
[ 130s ] thds: 100 tps: 4378.28 qps: 4378.28 ... 
[ 140s ] thds: 100 tps: 4424.54 qps: 4424.54 ... 
[ 150s ] thds: 100 tps: 4262.74 qps: 4262.74 ... 
[ 160s ] thds: 100 tps: 4269.99 qps: 4269.99 ... 
[ 170s ] thds: 100 tps: 4432.43 qps: 4432.43 ... 
[ 180s ] thds: 100 tps: 4394.48 qps: 4394.48 ... 
[ 190s ] thds: 100 tps: 4272.31 qps: 4272.31 ... 
[ 200s ] thds: 100 tps: 4401.83 qps: 4401.83 ... 
[ 210s ] thds: 100 tps: 4382.80 qps: 4382.80 ... 
[ 220s ] thds: 100 tps: 4616.85 qps: 4616.85 ... 
[ 230s ] thds: 100 tps: 4616.98 qps: 4616.98 ... 
[ 240s ] thds: 100 tps: 4535.14 qps: 4535.14 ... 
[ 250s ] thds: 100 tps: 4372.20 qps: 4372.20 ... 
[ 260s ] thds: 100 tps: 4428.16 qps: 4428.16 ... 
[ 270s ] thds: 100 tps: 4297.89 qps: 4297.89 ... 
[ 280s ] thds: 100 tps: 4312.50 qps: 4312.50 ... 
[ 290s ] thds: 100 tps: 4411.91 qps: 4411.91 ... 
[ 300s ] thds: 100 tps: 5032.67 qps: 5032.67 ... 
[ 310s ] thds: 100 tps: 5352.36 qps: 5352.36 ... 
[ 320s ] thds: 100 tps: 5845.58 qps: 5845.58 ... 
[ 330s ] thds: 100 tps: 5693.32 qps: 5693.32 ... 
[ 340s ] thds: 100 tps: 5875.50 qps: 5875.50 ... 
[ 350s ] thds: 100 tps: 5755.33 qps: 5755.33 ... 
[ 360s ] thds: 100 tps: 5824.79 qps: 5824.79 ... 
[ 370s ] thds: 100 tps: 5795.73 qps: 5795.73 ... 
[ 380s ] thds: 100 tps: 5716.80 qps: 5716.80 ... 
[ 390s ] thds: 100 tps: 5592.50 qps: 5592.50 ... 
[ 400s ] thds: 100 tps: 5582.21 qps: 5582.21 ... 
[ 410s ] thds: 100 tps: 5614.02 qps: 5614.02 ... 
[ 420s ] thds: 100 tps: 5689.36 qps: 5689.36 ... 
[ 430s ] thds: 100 tps: 5718.97 qps: 5718.97 ... 
[ 440s ] thds: 100 tps: 5709.59 qps: 5709.59 ... 
[ 450s ] thds: 100 tps: 5858.09 qps: 5858.09 ... 
[ 460s ] thds: 100 tps: 5848.95 qps: 5848.95 ... 
[ 470s ] thds: 100 tps: 5972.44 qps: 5972.44 ... 
[ 480s ] thds: 100 tps: 5831.00 qps: 5831.00 ... 
[ 490s ] thds: 100 tps: 5877.16 qps: 5877.16 ... 
[ 500s ] thds: 100 tps: 5708.42 qps: 5708.42 ... 
[ 510s ] thds: 100 tps: 5644.83 qps: 5644.83 ... 
[ 520s ] thds: 100 tps: 5702.69 qps: 5702.69 ... 
[ 530s ] thds: 100 tps: 5687.69 qps: 5687.69 ... 
[ 540s ] thds: 100 tps: 5852.80 qps: 5852.80 ... 
[ 550s ] thds: 100 tps: 5823.04 qps: 5823.04 ... 
[ 560s ] thds: 100 tps: 5821.09 qps: 5821.09 ... 
[ 570s ] thds: 100 tps: 5764.28 qps: 5764.28 ... 
[ 580s ] thds: 100 tps: 5579.16 qps: 5579.16 ... 
[ 590s ] thds: 100 tps: 5684.53 qps: 5684.53 ... 
[ 600s ] thds: 100 tps: 5702.38 qps: 5702.38 ... 
[ 610s ] thds: 100 tps: 5823.26 qps: 5823.26 ... 
[ 620s ] thds: 100 tps: 5662.75 qps: 5662.75 ... 
[ 630s ] thds: 100 tps: 5594.60 qps: 5594.60 ... 
[ 640s ] thds: 100 tps: 5459.33 qps: 5459.33 ... 
[ 650s ] thds: 100 tps: 5525.19 qps: 5525.19 ... 
[ 660s ] thds: 100 tps: 5619.06 qps: 5619.06 ... 
[ 670s ] thds: 100 tps: 5604.78 qps: 5604.78 ... 
[ 680s ] thds: 100 tps: 5618.56 qps: 5618.56 ... 
[ 690s ] thds: 100 tps: 5560.80 qps: 5560.80 ... 
[ 700s ] thds: 100 tps: 5647.27 qps: 5647.27 ... 
[ 710s ] thds: 100 tps: 5601.89 qps: 5601.89 ... 
[ 720s ] thds: 100 tps: 5454.79 qps: 5454.79 ... 
[ 730s ] thds: 100 tps: 5470.28 qps: 5470.28 ... 
[ 740s ] thds: 100 tps: 5690.62 qps: 5690.62 ... 
[ 750s ] thds: 100 tps: 5512.13 qps: 5512.13 ... 
[ 760s ] thds: 100 tps: 5456.98 qps: 5456.98 ... 
[ 770s ] thds: 100 tps: 5521.98 qps: 5521.98 ... 
[ 780s ] thds: 100 tps: 5522.36 qps: 5522.36 ... 
[ 790s ] thds: 100 tps: 5474.18 qps: 5474.18 ... 
[ 800s ] thds: 100 tps: 5354.53 qps: 5354.53 ... 
[ 810s ] thds: 100 tps: 5866.10 qps: 5866.10 ... 
[ 820s ] thds: 100 tps: 5818.19 qps: 5818.19 ... 
[ 830s ] thds: 100 tps: 5843.82 qps: 5843.82 ... 
[ 840s ] thds: 100 tps: 5859.66 qps: 5859.66 ... 
[ 850s ] thds: 100 tps: 5878.11 qps: 5878.11 ... 
[ 860s ] thds: 100 tps: 5704.24 qps: 5704.24 ... 
[ 870s ] thds: 100 tps: 5828.77 qps: 5828.77 ... 
[ 880s ] thds: 100 tps: 5869.70 qps: 5869.70 ... 
[ 890s ] thds: 100 tps: 5835.08 qps: 5835.08 ... 
[ 900s ] thds: 100 tps: 5804.33 qps: 5804.33 ...
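
To compare the early and late parts of a log like the one above without reading it line by line, a small Go helper can average the tps per time window. This is a sketch that assumes the interval lines have the `[ Ns ] thds: ... tps: ...` shape shown here; `parseTPS` and `meanTPS` are names made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// intervalRe matches the elapsed seconds and the tps field of an interval report line.
var intervalRe = regexp.MustCompile(`^\[\s*(\d+)s\s*\]\s+thds:\s+\d+\s+tps:\s+([0-9.]+)`)

// parseTPS extracts the elapsed seconds and tps from one line.
// The last return value is false when the line is not an interval report.
func parseTPS(line string) (int, float64, bool) {
	m := intervalRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, false
	}
	sec, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, 0, false
	}
	tps, err := strconv.ParseFloat(m[2], 64)
	if err != nil {
		return 0, 0, false
	}
	return sec, tps, true
}

// meanTPS averages tps over lines whose elapsed time t satisfies from < t <= to.
// It returns 0 when no line falls inside the window.
func meanTPS(lines []string, from, to int) float64 {
	var sum float64
	var n int
	for _, l := range lines {
		sec, tps, ok := parseTPS(l)
		if !ok || sec <= from || sec > to {
			continue
		}
		sum += tps
		n++
	}
	if n == 0 {
		return 0
	}
	return sum / float64(n)
}

func main() {
	// A few lines copied from the log above; feed the full log for a real comparison.
	lines := []string{
		"[ 10s ] thds: 100 tps: 4508.63 qps: 4508.63 ...",
		"[ 20s ] thds: 100 tps: 4787.31 qps: 4787.31 ...",
		"[ 890s ] thds: 100 tps: 5835.08 qps: 5835.08 ...",
		"[ 900s ] thds: 100 tps: 5804.33 qps: 5804.33 ...",
	}
	fmt.Printf("first 20s: %.2f tps\n", meanTPS(lines, 0, 20))
	fmt.Printf("last 20s:  %.2f tps\n", meanTPS(lines, 880, 900))
}
```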

Nov 24 '23 03:11 aptend

Continuing to optimize how the reader collects deletes.

Nov 29 '23 10:11 aptend

The reader deletes change has been merged; waiting to observe the effect.

Dec 05 '23 10:12 aptend

This needs long-term performance optimization; moving it to 1.2.

Dec 08 '23 04:12 XuPeng-SH

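At the beginning, no block has any deletes. As updates accumulate, more and more blocks get a delta loc, that is, a tombstone. Reading data then requires an extra read of the tombstone followed by applying the deletes, which is why the overhead keeps growing.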
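
A simplified Go sketch of this read path is shown below. The `block` type and `readBlock` function are hypothetical and much simpler than the real storage format; they only illustrate that a block without a tombstone can be returned as-is, while a block with one needs an extra check per row.

```go
package main

import "fmt"

// block is a hypothetical, simplified data block: rows plus an optional tombstone.
type block struct {
	rows      []int64
	tombstone map[int]struct{} // deleted row offsets; nil means no deletes recorded
}

// readBlock returns the visible rows and the number of tombstone lookups performed.
func readBlock(b block) (visible []int64, lookups int) {
	if b.tombstone == nil {
		// Fast path: nothing was deleted, so the rows are returned unchanged.
		return b.rows, 0
	}
	// Slow path: every row is checked against the tombstone before it is returned.
	for i, v := range b.rows {
		lookups++
		if _, deleted := b.tombstone[i]; !deleted {
			visible = append(visible, v)
		}
	}
	return visible, lookups
}

func main() {
	clean := block{rows: []int64{10, 20, 30, 40}}
	updated := block{
		rows:      []int64{10, 20, 30, 40},
		tombstone: map[int]struct{}{1: {}, 3: {}},
	}
	for _, b := range []block{clean, updated} {
		visible, lookups := readBlock(b)
		fmt.Println("visible:", visible, "tombstone lookups:", lookups)
	}
}
```

As the share of blocks with tombstones grows, more reads take the slow path, which matches the gradual slowdown described in this comment.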

Apr 01 '24 11:04 XuPeng-SH

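On the 128 machine, with the memory cache set to 32 GB, tpcc degrades far more slowly than with the default configuration. As a next step, we can try removing the IO interference and then check where the drop occurs.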

Apr 08 '24 10:04 aptend

Working on the three-table refactoring and other requirements.

May 22 '24 10:05 aptend

This is being tracked in other issues; closing this issue.

May 22 '24 13:05 aressu1985