
[Bug]: IVF index benchmark QPS drops by more than half

Open heni02 opened this issue 9 months ago • 17 comments

Is there an existing issue for the same bug?

  • [x] I have checked the existing issues.

Branch Name

main

Commit ID

7dd7f9a3

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

Testing the ivf sift128 benchmark on the latest main shows QPS more than halved compared with the September 2023 results, while recall is essentially the same. https://doc.weixin.qq.com/sheet/e3_AYYAgwazACsB4L72hnoQ1eAEU31N6?scode=AJsA6gc3AA8QPAY9JuAYYAgwazACs&tab=o01hkj Image

Testing on the commit before the IVF optimization PR (commit: 3bc8106c3) found that QPS had already dropped by more than half compared with earlier results.

Profile to be added.

Expected Behavior

No response

Steps to Reproduce

ann-benchmark sift128 test

Additional information

No response

heni02 avatar Mar 25 '25 03:03 heni02

What is the probe limit? The old code's default probe limit is 1, while the new code's default is 5.
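(For an apples-to-apples comparison it may help to pin the probe setting explicitly before running the benchmark, so both versions scan the same number of lists. The session variable name below is an assumption and should be double-checked against the docs:

    mysql> SET @probe_limit = 1;
)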

cpegeric avatar Mar 31 '25 08:03 cpegeric

Here are the results from my computer; search time is faster than before.

For 500 lists:

| commit | build time | QPS |
| --- | --- | --- |
| 9bf1f1890 | 32.09 sec | 36.62 |
| b56e311 | 30.32 sec | 25.0874 |
| current | 16.32 sec | 43.89 |

For 2000 lists:

| commit | build time | QPS |
| --- | --- | --- |
| 9bf1f1890 | | |
| b56e311 | 1 min 43.13 sec | 29.6222 |
| current | 2 min 7.72 sec | 76.16 |

For 4000 lists:

| commit | build time | QPS |
| --- | --- | --- |
| 9bf1f1890 | | |
| b56e311 | 3 min 23 sec | 26.6735 |
| current | 7 min 33 sec | 87.7254 |

Detailed logs:

old code (9bf1f1890)

build time,

mysql> create index idx3 using ivfflat on t3(b) lists=500 op_type "vector_l2_ops";

Query OK, 0 rows affected (32.09 sec)

Search time,

% python3 query_mo_pg.py

Query 501 completed in 0.0440s with recall: 0.9900 Recall: 0.9635, Total Duration: 15.4958s, Avg Latency: 0.0309, QPS: 36.6268

commit b56e311

build time,

mysql> create index idx3 using ivfflat on t3(b) lists=500 op_type "vector_l2_ops";

Query OK, 0 rows affected (30.32 sec)

% python3 query_mo_pg.py

Query 501 completed in 0.0397s with recall: 0.9900 Recall: 0.9655, Total Duration: 19.9702s, Avg Latency: 0.0399, QPS: 25.0874

For 2000 lists,

mysql> create index idx3 using ivfflat on t3(b) lists=2000 op_type "vector_l2_ops";

Query OK, 0 rows affected (1 min 43.13 sec)

% python3 query_mo_pg.py

Query 501 completed in 0.0309s with recall: 0.9300 Recall: 0.8863, Total Duration: 16.9130s, Avg Latency: 0.0338, QPS: 29.6222

For 4000 lists,

mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";

Query OK, 0 rows affected (3 min 33.26 sec)

% python3 query_mo_pg.py

Query 501 completed in 0.0295s with recall: 0.7900 Recall: 0.8312, Total Duration: 18.7827s, Avg Latency: 0.0375, QPS: 26.6735

new code

mysql> create index idx3 using ivfflat on t3(b) lists=500 op_type "vector_l2_ops";

Query OK, 0 rows affected (16.32 sec)

% python3 query_mo_pg.py

Recall: 0.9616, Total Duration: 11.4130s, Avg Latency: 0.0228, QPS: 43.8973

2000 lists,

mysql> create index idx3 using ivfflat on t3(b) lists=2000 op_type "vector_l2_ops";

Query OK, 0 rows affected (2 min 7.72 sec)

% python3 query_mo_pg.py

Recall: 0.8598, Total Duration: 6.5781s, Avg Latency: 0.0131, QPS: 76.1618

4000 lists,

mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";

Query OK, 0 rows affected (7 min 33.65 sec)

Query 501 completed in 0.1081s with recall: 0.7600 Recall: 0.7947, Total Duration: 5.7110s, Avg Latency: 0.0114, QPS: 87.7254
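For context, the Recall / Avg Latency / QPS lines above are printed by query_mo_pg.py, which is not included in this thread. Below is a minimal, hypothetical sketch of such a measurement loop: it assumes a pymysql connection, a precomputed set of ground-truth top-k ids per query, MatrixOne's l2_distance function, and that a bracketed string literal is accepted as a vector; the real script's connection details and exact formulas may differ.

    import time
    import pymysql

    def run_benchmark(conn, queries, ground_truth, k=10):
        # queries: list of float32 vectors; ground_truth: list of sets with the true top-k ids
        cur = conn.cursor()
        recalls, latencies = [], []
        for q, truth in zip(queries, ground_truth):
            vec = "[" + ",".join(str(x) for x in q) + "]"
            start = time.time()
            # ANN search on the indexed column; the ivfflat index is picked up by the planner
            cur.execute("SELECT a FROM t3 ORDER BY l2_distance(b, %s) LIMIT %s", (vec, k))
            ids = {row[0] for row in cur.fetchall()}
            latencies.append(time.time() - start)
            recalls.append(len(ids & truth) / k)
        total = sum(latencies)
        print("Recall: %.4f, Total Duration: %.4fs, Avg Latency: %.4f, QPS: %.4f"
              % (sum(recalls) / len(recalls), total, total / len(latencies), len(latencies) / total))

    # usage (hypothetical connection parameters):
    # conn = pymysql.connect(host="127.0.0.1", port=6001, user="dump", password="111", db="a")
    # run_benchmark(conn, queries, ground_truth)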

cpegeric avatar Mar 31 '25 10:03 cpegeric

Sometimes, after creating the index and running a search, the IVF entries search is slow. But after a reboot, running the search again is fast.

When slow, the total waiting time for data is around 1000ms. When fast, it is around 100ms, a 10x difference.

99% of the time is spent waiting for data. If we can get the data faster, we can always improve the performance.

Long waiting time

    probe nthread = 12, probe = 25... return centroid ids [2616 1610 1155 1241 2800 1895 413 2958 327 1049 2836 316 2934 590 2883 2386 45 2387 570 208 2580 1015 2612 2920 2594]
    timing: nthread = 12, find Centroids 911.917µs
    searchEntries: batch size 426, wait batch 5.202541ms, distance measure (batch) 96.209µs
    searchEntries: batch size 384, wait batch 9.877375ms, distance measure (batch) 81.375µs
    searchEntries: batch size 282, wait batch 14.454125ms, distance measure (batch) 57.625µs
    searchEntries: batch size 349, wait batch 18.967209ms, distance measure (batch) 81.208µs
    searchEntries: batch size 691, wait batch 23.696083ms, distance measure (batch) 158.459µs
    searchEntries: batch size 766, wait batch 28.22225ms, distance measure (batch) 174.875µs
    searchEntries: batch size 784, wait batch 32.7785ms, distance measure (batch) 171.333µs
    searchEntries: batch size 309, wait batch 37.271333ms, distance measure (batch) 66.084µs
    searchEntries: batch size 432, wait batch 41.778958ms, distance measure (batch) 95.875µs
    searchEntries: batch size 658, wait batch 46.448583ms, distance measure (batch) 156.417µs
    searchEntries: batch size 378, wait batch 50.935291ms, distance measure (batch) 82.709µs
    searchEntries: batch size 589, wait batch 55.061125ms, distance measure (batch) 123.958µs
    searchEntries: batch size 594, wait batch 53.903417ms, distance measure (batch) 136.208µs
    searchEntries: batch size 1365, wait batch 53.572416ms, distance measure (batch) 284.459µs
    searchEntries: batch size 847, wait batch 53.219208ms, distance measure (batch) 196.584µs
    searchEntries: batch size 1121, wait batch 52.884875ms, distance measure (batch) 228.958µs
    searchEntries: batch size 484, wait batch 52.163708ms, distance measure (batch) 116.709µs
    searchEntries: batch size 560, wait batch 51.7715ms, distance measure (batch) 121.792µs
    searchEntries: batch size 620, wait batch 51.430209ms, distance measure (batch) 137.791µs
    searchEntries: batch size 547, wait batch 51.171083ms, distance measure (batch) 117.584µs
    searchEntries: batch size 930, wait batch 50.797292ms, distance measure (batch) 195.833µs
    searchEntries: batch size 384, wait batch 50.223083ms, distance measure (batch) 100.5µs
    timing: nthread = 12, total batch = 22, total dist_ts = 2970 us, total wait_ts 885 ms, ivf search 99.074334ms
    timing: nthread = 12, heap n-best search 88.709µs

Short waiting time

    probe nthread = 12, probe = 25... return centroid ids [670 1361 2374 1544 1215 287 2535 1424 1782 2074 2496 1651 1852 2975 1648 2215 818 1999 2969 62 2040 147 706 1246 2265]
    timing: nthread = 12, find Centroids 877.167µs
    searchEntries: batch size 761, wait batch 498.5µs, distance measure (batch) 177.458µs
    searchEntries: batch size 470, wait batch 616.208µs, distance measure (batch) 109.625µs
    searchEntries: batch size 241, wait batch 703.625µs, distance measure (batch) 55.417µs
    searchEntries: batch size 502, wait batch 842.625µs, distance measure (batch) 116.167µs
    searchEntries: batch size 395, wait batch 973.042µs, distance measure (batch) 84.708µs
    searchEntries: batch size 498, wait batch 1.080125ms, distance measure (batch) 116.083µs
    searchEntries: batch size 408, wait batch 1.191792ms, distance measure (batch) 91.292µs
    searchEntries: batch size 589, wait batch 1.542ms, distance measure (batch) 185.167µs
    searchEntries: batch size 373, wait batch 1.643625ms, distance measure (batch) 116.542µs
    searchEntries: batch size 1369, wait batch 1.414416ms, distance measure (batch) 371.959µs
    searchEntries: batch size 540, wait batch 1.76925ms, distance measure (batch) 120.625µs
    searchEntries: batch size 569, wait batch 1.385833ms, distance measure (batch) 166.875µs
    searchEntries: batch size 1225, wait batch 1.912833ms, distance measure (batch) 311.209µs
    searchEntries: batch size 413, wait batch 1.4365ms, distance measure (batch) 118.292µs
    searchEntries: batch size 284, wait batch 1.468583ms, distance measure (batch) 72.375µs
    searchEntries: batch size 1085, wait batch 1.5505ms, distance measure (batch) 252.959µs
    searchEntries: batch size 495, wait batch 1.483708ms, distance measure (batch) 105.5µs
    searchEntries: batch size 513, wait batch 1.456833ms, distance measure (batch) 124.042µs
    searchEntries: batch size 437, wait batch 1.469833ms, distance measure (batch) 105.292µs
    searchEntries: batch size 207, wait batch 1.253958ms, distance measure (batch) 56.292µs
    searchEntries: batch size 856, wait batch 1.184917ms, distance measure (batch) 220.333µs
    searchEntries: batch size 463, wait batch 1.351084ms, distance measure (batch) 102.541µs
    searchEntries: batch size 1036, wait batch 1.391875ms, distance measure (batch) 235.166µs
    searchEntries: batch size 570, wait batch 5.380333ms, distance measure (batch) 125.084µs
    searchEntries: batch size 503, wait batch 5.496708ms, distance measure (batch) 149.75µs
    searchEntries: batch size 357, wait batch 5.526125ms, distance measure (batch) 100.625µs
    timing: nthread = 12, total batch = 26, total dist_ts = 3780 us, total wait_ts 46 ms, ivf search 7.956833ms
    timing: nthread = 12, heap n-best search 40.875µs
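A rough reading of the two traces: the per-batch waits are accumulated across nthread = 12 workers, so the slow run's total wait_ts of 885 ms corresponds to roughly 885 / 12 ≈ 74 ms of wall-clock waiting, in line with its 99 ms ivf search time, while the fast run's 46 ms of accumulated waiting works out to about 4 ms against an 8 ms search. In both runs the distance computation itself (total dist_ts of about 3 ms) is negligible, so the slowdown appears to be almost entirely waiting for the entry batches.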

cpegeric avatar Mar 31 '25 12:03 cpegeric

You can sometimes reproduce the bug with the following steps. The slow SQL always comes right after CREATE INDEX; usually running CREATE INDEX twice reproduces the bug.

To reproduce the bug,

create database a;
use a;
create table t3(a int primary key, b vecf32(128));

cd /data1/heni/mo-benchmark-test/sift128
/usr/local/python3/bin/python3.10 insert_mo.py

mysql> SET experimental_ivf_index = 1;

mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";

% python3 query_mo_pg.py

mysql> drop index idx3 on t3;

mysql> create index idx3 using ivfflat on t3(b) lists=4000 op_type "vector_l2_ops";

% python3 query_mo_pg.py

The SQL to run:
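    // Entries lookup built by the IVF search path: for the probed centroid ids,
    // fetch each matching entry's primary key and stored entry (the vector used
    // for the distance computation) at the index's current version. The rendered
    // SQL is shown in a later comment.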

    sql := fmt.Sprintf("SELECT `%s`, `%s` FROM `%s`.`%s` WHERE `%s` = %d AND `%s` IN (%s)",
            catalog.SystemSI_IVFFLAT_TblCol_Entries_pk,
            catalog.SystemSI_IVFFLAT_TblCol_Entries_entry,
            tblcfg.DbName, tblcfg.EntriesTable,
            catalog.SystemSI_IVFFLAT_TblCol_Entries_version,
            idx.Version,
            catalog.SystemSI_IVFFLAT_TblCol_Entries_id,
            instr,
    )

cpegeric avatar Mar 31 '25 14:03 cpegeric

Same SIFT test, but a slow run (CPU profile):

Recall: 0.8260, Total Duration: 53.0852s, Avg Latency: 0.1060, QPS: 9.4377

Image Image

cpegeric avatar Mar 31 '25 16:03 cpegeric

Fast SQL CPU profile. Recall: 0.8260, Total Duration: 6.0623s, Avg Latency: 0.0121, QPS: 82.6419

Image Image

cpegeric avatar Mar 31 '25 16:03 cpegeric

heap profile for slow test

Image

cpegeric avatar Mar 31 '25 16:03 cpegeric

Same plan but different running time ( 143ms vs 12ms)

SQL: SELECT __mo_index_pri_col, __mo_index_centroid_fk_entry FROM a.__mo_index_secondary_0195f094-f18e-7a0a-9a81-dca5bff76b21 WHERE __mo_index_centroid_fk_version = 0 AND __mo_index_centroid_fk_id IN (2517,2478,2230,128,936,2225,1979,1506,723,2782,2117,2780,64,1252,1510,1980,747,2789,1227,2489,2992,980,2070,1819,2595)

Image Image

cpegeric avatar Apr 01 '25 09:04 cpegeric

It is about the cache. After a reboot, running explain analyze takes around 150ms and reads from file. After that, it is fast, around 12ms, without reading from file.

Image

But after create index, explain analyze always reads from file and stays slow; the data never gets cached.
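One way to observe this directly is to run the entries lookup under explain analyze and compare consecutive runs; the table name and centroid ids below are copied from the query shown earlier (IN list shortened for brevity):

    mysql> explain analyze SELECT __mo_index_pri_col, __mo_index_centroid_fk_entry FROM a.__mo_index_secondary_0195f094-f18e-7a0a-9a81-dca5bff76b21 WHERE __mo_index_centroid_fk_version = 0 AND __mo_index_centroid_fk_id IN (2517,2478,2230);

On a cold or uncached run the scan time dominates (the 150ms case above); once the data is cached, the same statement lands in the 12ms range.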

cpegeric avatar Apr 01 '25 15:04 cpegeric

Setting the cache config in etc/launch/cn.toml fixes the issue. The default cache size is 512M.

    [[fileservice]]
    name = "LOCAL"
    backend = "DISK"

    [[fileservice]]
    name = "SHARED"
    backend = "DISK"
    data-dir = "mo-data/shared"

    [fileservice.cache]
    memory-capacity = "8GB"
    disk-capacity = "8GB"
    disk-path = "mo-data/file-service-cache"
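A rough size check suggests why the 512M default is not enough for this workload: if this is the standard SIFT1M dataset (1,000,000 vectors of 128 float32 dimensions), the raw vectors alone occupy about 1,000,000 × 128 × 4 bytes, roughly 512 MB, and the IVF entries table stores each vector again next to its centroid id and primary key, so the working set exceeds the default cache and keeps getting evicted.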

cpegeric avatar Apr 03 '25 16:04 cpegeric

@cpegeric CN configuration of the environment where this issue occurs: Image

heni02 avatar Apr 10 '25 03:04 heni02

On the main branch (latest commit), create index performance has now recovered to the previous level, but on 2.1-dev (09bb68d8a) it has not recovered and is still very slow (both branches were tested with the same etc configuration). @cpegeric please take a look. main: Image Image

2.1-dev: Image Image

heni02 avatar Apr 15 '25 06:04 heni02

The slow create index may be caused by issue https://github.com/matrixorigin/matrixone/issues/21456

cpegeric avatar Apr 22 '25 08:04 cpegeric

@cpegeric this is worth porting to the 2.1 branch, unless it is a HUGE change. Quote: "Create index performance on the main branch (latest commit) has recovered to the previous level, while on 2.1-dev (09bb68d8a) it has not recovered yet."

fengttt avatar Apr 27 '25 08:04 fengttt

@fengttt there are more than 60 file changes. I think that is too many changes, isn't it?

cpegeric avatar May 09 '25 15:05 cpegeric

Please verify on 2.1-dev after the code is merged.

cpegeric avatar May 12 '25 09:05 cpegeric

Will test and verify in the next release.

heni02 avatar May 30 '25 06:05 heni02

Verified on main branch commit fcbc898e63: sift128 and gist960 IVF QPS are improved compared with before the optimization. Closed.

Image Image

heni02 avatar Aug 06 '25 03:08 heni02