[Bug]: IVF index benchmark QPS regressed by more than 2x
Is there an existing issue for the same bug?
- [x] I have checked the existing issues.
Branch Name
main
Commit ID
7dd7f9a3
Other Environment Information
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
Testing the latest main with the IVF sift128 benchmark shows QPS has dropped by more than half compared with the September 2023 results, while recall is essentially the same.
https://doc.weixin.qq.com/sheet/e3_AYYAgwazACsB4L72hnoQ1eAEU31N6?scode=AJsA6gc3AA8QPAY9JuAYYAgwazACs&tab=o01hkj
Testing the PR before the IVF optimization (commit 3bc8106c3) shows QPS had already dropped by more than half compared with earlier results.
Profile to be added.
Expected Behavior
No response
Steps to Reproduce
ann-benchmark sift128 test
Additional information
No response
What is the probe limit? The old code's default probe limit is 1, while the new code's default is 5.
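If part of the gap is just the default changing from 1 to 5, pinning the limit per session makes runs on old and new code comparable. A minimal timing sketch, assuming the limit is exposed as a session variable named `@probe_limit` and default local connection settings (both assumptions; adjust to the build under test):

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"strings"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func timedSearch(ctx context.Context, db *sql.DB, probe int, query string) (time.Duration, error) {
	// Pin one pooled connection: a session variable set via db.Exec could
	// otherwise land on a different connection than the one running the query.
	conn, err := db.Conn(ctx)
	if err != nil {
		return 0, err
	}
	defer conn.Close()
	if _, err := conn.ExecContext(ctx, fmt.Sprintf("SET @probe_limit = %d", probe)); err != nil {
		return 0, err
	}
	start := time.Now()
	rows, err := conn.QueryContext(ctx, query)
	if err != nil {
		return 0, err
	}
	defer rows.Close()
	for rows.Next() {
		// Drain the result set so the timing includes row transfer.
	}
	return time.Since(start), rows.Err()
}

func main() {
	db, err := sql.Open("mysql", "root:111@tcp(127.0.0.1:6001)/a")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	// A 128-dim zero vector as a stand-in query point for the sift128 table.
	vec := "[" + strings.TrimSuffix(strings.Repeat("0,", 128), ",") + "]"
	q := fmt.Sprintf("SELECT a FROM t3 ORDER BY l2_distance(b, '%s') LIMIT 100", vec)
	for _, p := range []int{1, 5} {
		d, err := timedSearch(context.Background(), db, p, q)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("probe_limit=%d: %v\n", p, d)
	}
}
```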
Here are the results from my computer; search time is faster than before.
For 500 lists:

| commit | build time | QPS |
|---|---|---|
| 9bf1f1890 | 32.09 sec | 36.62 |
| b56e311 | 30.32 sec | 25.0874 |
| current | 16.32 sec | 43.89 |

For 2000 lists:

| commit | build time | QPS |
|---|---|---|
| 9bf1f1890 | | |
| b56e311 | 1 min 43.13 sec | 29.6222 |
| current | 2 min 7.72 sec | 76.16 |

For 4000 lists:

| commit | build time | QPS |
|---|---|---|
| 9bf1f1890 | | |
| b56e311 | 3 min 33.26 sec | 26.6735 |
| current | 7 min 33 sec | 87.7254 |
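(In these runs the reported QPS is effectively the reciprocal of the average latency; e.g. in the detailed logs below, an avg latency of 0.0228 s corresponds to QPS 43.8973.)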
Detailed logs:
old code (9bf1f1890)

Build time:

```
mysql> create index idx3 using ivfflat on t3(b) lists=500 op_type "vector_l2_ops";
Query OK, 0 rows affected (32.09 sec)
```

Search time:

```
% python3 query_mo_pg.py
Query 501 completed in 0.0440s with recall: 0.9900 Recall: 0.9635, Total Duration: 15.4958s, Avg Latency: 0.0309, QPS: 36.6268
```
commit b56e311

Build time:

```
mysql> create index idx3 using ivfflat on t3(b) lists=500 op_type "vector_l2_ops";
Query OK, 0 rows affected (30.32 sec)

% python3 query_mo_pg.py
Query 501 completed in 0.0397s with recall: 0.9900 Recall: 0.9655, Total Duration: 19.9702s, Avg Latency: 0.0399, QPS: 25.0874
```

For 2000 lists:

```
mysql> create index idx3 using ivfflat on t3(b) lists=2000 op_type "vector_l2_ops";
Query OK, 0 rows affected (1 min 43.13 sec)

% python3 query_mo_pg.py
Query 501 completed in 0.0309s with recall: 0.9300 Recall: 0.8863, Total Duration: 16.9130s, Avg Latency: 0.0338, QPS: 29.6222
```

For 4000 lists:

```
mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";
Query OK, 0 rows affected (3 min 33.26 sec)

% python3 query_mo_pg.py
Query 501 completed in 0.0295s with recall: 0.7900 Recall: 0.8312, Total Duration: 18.7827s, Avg Latency: 0.0375, QPS: 26.6735
```
new code

For 500 lists:

```
mysql> create index idx3 using ivfflat on t3(b) lists=500 op_type "vector_l2_ops";
Query OK, 0 rows affected (16.32 sec)

% python3 query_mo_pg.py
Recall: 0.9616, Total Duration: 11.4130s, Avg Latency: 0.0228, QPS: 43.8973
```

For 2000 lists:

```
mysql> create index idx3 using ivfflat on t3(b) lists=2000 op_type "vector_l2_ops";
Query OK, 0 rows affected (2 min 7.72 sec)

% python3 query_mo_pg.py
Recall: 0.8598, Total Duration: 6.5781s, Avg Latency: 0.0131, QPS: 76.1618
```

For 4000 lists:

```
mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";
Query OK, 0 rows affected (7 min 33.65 sec)

Query 501 completed in 0.1081s with recall: 0.7600 Recall: 0.7947, Total Duration: 5.7110s, Avg Latency: 0.0114, QPS: 87.7254
```
Sometimes, after CREATE INDEX followed by a search, the IVF entries search is slow; after a reboot, the same search is fast.
When slow, the total waiting time for data is around 1000 ms; when fast, it is around 100 ms, a 10x difference.
99% of the time is spent waiting for data. As long as we get the data faster, performance improves accordingly.
Long waiting time:

```
probe nthread = 12, probe = 25... return centroid ids [2616 1610 1155 1241 2800 1895 413 2958 327 1049 2836 316 2934 590 2883 2386 45 2387 570 208 2580 1015 2612 2920 2594]
timing: nthread = 12, find Centroids 911.917µs
searchEntries: batch size 426, wait batch 5.202541ms, distance measure (batch) 96.209µs
searchEntries: batch size 384, wait batch 9.877375ms, distance measure (batch) 81.375µs
searchEntries: batch size 282, wait batch 14.454125ms, distance measure (batch) 57.625µs
searchEntries: batch size 349, wait batch 18.967209ms, distance measure (batch) 81.208µs
searchEntries: batch size 691, wait batch 23.696083ms, distance measure (batch) 158.459µs
searchEntries: batch size 766, wait batch 28.22225ms, distance measure (batch) 174.875µs
searchEntries: batch size 784, wait batch 32.7785ms, distance measure (batch) 171.333µs
searchEntries: batch size 309, wait batch 37.271333ms, distance measure (batch) 66.084µs
searchEntries: batch size 432, wait batch 41.778958ms, distance measure (batch) 95.875µs
searchEntries: batch size 658, wait batch 46.448583ms, distance measure (batch) 156.417µs
searchEntries: batch size 378, wait batch 50.935291ms, distance measure (batch) 82.709µs
searchEntries: batch size 589, wait batch 55.061125ms, distance measure (batch) 123.958µs
searchEntries: batch size 594, wait batch 53.903417ms, distance measure (batch) 136.208µs
searchEntries: batch size 1365, wait batch 53.572416ms, distance measure (batch) 284.459µs
searchEntries: batch size 847, wait batch 53.219208ms, distance measure (batch) 196.584µs
searchEntries: batch size 1121, wait batch 52.884875ms, distance measure (batch) 228.958µs
searchEntries: batch size 484, wait batch 52.163708ms, distance measure (batch) 116.709µs
searchEntries: batch size 560, wait batch 51.7715ms, distance measure (batch) 121.792µs
searchEntries: batch size 620, wait batch 51.430209ms, distance measure (batch) 137.791µs
searchEntries: batch size 547, wait batch 51.171083ms, distance measure (batch) 117.584µs
searchEntries: batch size 930, wait batch 50.797292ms, distance measure (batch) 195.833µs
searchEntries: batch size 384, wait batch 50.223083ms, distance measure (batch) 100.5µs
timing: nthread = 12, total batch = 22, total dist_ts = 2970 us, total wait_ts 885 ms, ivf search 99.074334ms
timing: nthread = 12, heap n-best search 88.709µs
```
Short waiting time:

```
probe nthread = 12, probe = 25... return centroid ids [670 1361 2374 1544 1215 287 2535 1424 1782 2074 2496 1651 1852 2975 1648 2215 818 1999 2969 62 2040 147 706 1246 2265]
timing: nthread = 12, find Centroids 877.167µs
searchEntries: batch size 761, wait batch 498.5µs, distance measure (batch) 177.458µs
searchEntries: batch size 470, wait batch 616.208µs, distance measure (batch) 109.625µs
searchEntries: batch size 241, wait batch 703.625µs, distance measure (batch) 55.417µs
searchEntries: batch size 502, wait batch 842.625µs, distance measure (batch) 116.167µs
searchEntries: batch size 395, wait batch 973.042µs, distance measure (batch) 84.708µs
searchEntries: batch size 498, wait batch 1.080125ms, distance measure (batch) 116.083µs
searchEntries: batch size 408, wait batch 1.191792ms, distance measure (batch) 91.292µs
searchEntries: batch size 589, wait batch 1.542ms, distance measure (batch) 185.167µs
searchEntries: batch size 373, wait batch 1.643625ms, distance measure (batch) 116.542µs
searchEntries: batch size 1369, wait batch 1.414416ms, distance measure (batch) 371.959µs
searchEntries: batch size 540, wait batch 1.76925ms, distance measure (batch) 120.625µs
searchEntries: batch size 569, wait batch 1.385833ms, distance measure (batch) 166.875µs
searchEntries: batch size 1225, wait batch 1.912833ms, distance measure (batch) 311.209µs
searchEntries: batch size 413, wait batch 1.4365ms, distance measure (batch) 118.292µs
searchEntries: batch size 284, wait batch 1.468583ms, distance measure (batch) 72.375µs
searchEntries: batch size 1085, wait batch 1.5505ms, distance measure (batch) 252.959µs
searchEntries: batch size 495, wait batch 1.483708ms, distance measure (batch) 105.5µs
searchEntries: batch size 513, wait batch 1.456833ms, distance measure (batch) 124.042µs
searchEntries: batch size 437, wait batch 1.469833ms, distance measure (batch) 105.292µs
searchEntries: batch size 207, wait batch 1.253958ms, distance measure (batch) 56.292µs
searchEntries: batch size 856, wait batch 1.184917ms, distance measure (batch) 220.333µs
searchEntries: batch size 463, wait batch 1.351084ms, distance measure (batch) 102.541µs
searchEntries: batch size 1036, wait batch 1.391875ms, distance measure (batch) 235.166µs
searchEntries: batch size 570, wait batch 5.380333ms, distance measure (batch) 125.084µs
searchEntries: batch size 503, wait batch 5.496708ms, distance measure (batch) 149.75µs
searchEntries: batch size 357, wait batch 5.526125ms, distance measure (batch) 100.625µs
timing: nthread = 12, total batch = 26, total dist_ts = 3780 us, total wait_ts 46 ms, ivf search 7.956833ms
timing: nthread = 12, heap n-best search 40.875µs
```
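Comparing the two traces: aggregate wait time (summed across the 12 threads) drops from 885 ms in the slow run to 46 ms in the fast run, while total distance computation stays around 3 to 4 ms in both. Essentially all of the slowdown is time spent waiting for batches, consistent with the 99% figure above.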
Sometimes you can reproduce the bug with the following steps. A slow SQL run always comes right after CREATE INDEX; usually running CREATE INDEX twice reproduces the bug.
To reproduce the bug:

```
create database a; use a; create table t3(a int primary key, b vecf32(128));

cd /data1/heni/mo-benchmark-test/sift128
/usr/local/python3/bin/python3.10 insert_mo.py

mysql> SET experimental_ivf_index = 1;
mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";
% python3 query_mo_pg.py
mysql> drop index idx3 on t3;
mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";
% python3 query_mo_pg.py
```
The SQL to run:
```go
sql := fmt.Sprintf("SELECT `%s`, `%s` FROM `%s`.`%s` WHERE `%s` = %d AND `%s` IN (%s)",
	catalog.SystemSI_IVFFLAT_TblCol_Entries_pk,
	catalog.SystemSI_IVFFLAT_TblCol_Entries_entry,
	tblcfg.DbName, tblcfg.EntriesTable,
	catalog.SystemSI_IVFFLAT_TblCol_Entries_version,
	idx.Version,
	catalog.SystemSI_IVFFLAT_TblCol_Entries_id,
	instr,
)
```
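Here `instr` is the comma-joined list of centroid ids to probe. A minimal sketch of how such a list could be assembled (the helper name `buildInList` is hypothetical, not the actual code):

```go
import (
	"strconv"
	"strings"
)

// buildInList joins centroid ids into the comma-separated list that is
// interpolated as `instr` in the Sprintf above (illustrative only).
func buildInList(ids []int64) string {
	parts := make([]string, len(ids))
	for i, id := range ids {
		parts[i] = strconv.FormatInt(id, 10)
	}
	return strings.Join(parts, ",")
}
```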
Same SIFT test, but a slow run (CPU profile):
Recall: 0.8260, Total Duration: 53.0852s, Avg Latency: 0.1060, QPS: 9.4377
Fast SQL run (CPU profile):
Recall: 0.8260, Total Duration: 6.0623s, Avg Latency: 0.0121, QPS: 82.6419
Heap profile for the slow test:
Same plan but very different running times (143 ms vs 12 ms).
SQL:

```sql
SELECT __mo_index_pri_col, __mo_index_centroid_fk_entry FROM a.__mo_index_secondary_0195f094-f18e-7a0a-9a81-dca5bff76b21 WHERE __mo_index_centroid_fk_version = 0 AND __mo_index_centroid_fk_id IN (2517,2478,2230,128,936,2225,1979,1506,723,2782,2117,2780,64,1252,1510,1980,747,2789,1227,2489,2992,980,2070,1819,2595)
```
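(This is exactly the query shape produced by the Sprintf above, with 25 centroid ids in the IN list, matching probe = 25 in the traces.)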
It is about the cache. After a reboot, the first EXPLAIN ANALYZE takes around 150 ms and reads from file; after that, runs are fast, around 12 ms, and do not read from file.
But right after CREATE INDEX, EXPLAIN ANALYZE always reads from file and is slow; the data never gets cached.
Setting the cache config in etc/launch/cn.toml fixes the issue. The default cache size is 512 MB:
```toml
[[fileservice]]
name = "LOCAL"
backend = "DISK"

[[fileservice]]
name = "SHARED"
backend = "DISK"
data-dir = "mo-data/shared"

[fileservice.cache]
memory-capacity = "8GB"
disk-capacity = "8GB"
disk-path = "mo-data/file-service-cache"
```
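For scale: assuming the standard SIFT1M dataset, the raw vectors alone are 1,000,000 × 128 × 4 bytes ≈ 512 MB, already the size of the entire default cache, before counting the index entries table. When the working set exceeds cache capacity, repeated scans can miss every single time. A purely illustrative LRU sketch (not the fileservice code) shows the effect:

```go
package main

import (
	"container/list"
	"fmt"
)

type lru struct {
	cap   int
	order *list.List // front = most recently used
	items map[int]*list.Element
}

func newLRU(cap int) *lru {
	return &lru{cap: cap, order: list.New(), items: make(map[int]*list.Element)}
}

// get reports whether key was cached, then touches or inserts it.
func (c *lru) get(key int) bool {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return true
	}
	if c.order.Len() >= c.cap {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(int))
	}
	c.items[key] = c.order.PushFront(key)
	return false
}

func main() {
	const blocks = 1000 // working set: 1000 blocks
	c := newLRU(512)    // cache holds only 512 of them
	hits := 0
	for pass := 0; pass < 2; pass++ {
		for b := 0; b < blocks; b++ {
			if c.get(b) {
				hits++
			}
		}
	}
	fmt.Printf("hits: %d / %d reads\n", hits, 2*blocks) // prints: hits: 0 / 2000 reads
}
```

With capacity 512 against 1000 blocks, every read evicts a block a later read needs, so the second pass misses just as the first did; raising the capacity above the working set (as the 8GB config does) turns the second pass into all hits.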
@cpegeric The CN configuration of the environment where the problem occurs:
On the current main branch (latest commit), CREATE INDEX performance has recovered to its previous level; on 2.1-dev (09bb68d8a) it has not recovered and is still very slow (both branches were tested with identical etc configs). @cpegeric please take a look.
main:
2.1-dev:
The slow CREATE INDEX may be caused by https://github.com/matrixorigin/matrixone/issues/21456
@cpegeric this is worth porting to the 2.1 branch, unless it is a HUGE change. Quote: "On the current main branch (latest commit), CREATE INDEX performance has recovered to its previous level; 2.1-dev (09bb68d8a) has not recovered."
@fengttt there are more than 60 files changed. I think that is too many changes, isn't it?
Please verify on 2.1-dev after the code is merged.
To be verified in the next release's testing.
Verified on main branch commit fcbc898e63: sift128 and gist960 IVF QPS are improved over the pre-optimization results. Closed.