test: add regression test for ivf/pq search
The test covers 5 parameters (cache, k, nprobes, refine factor, dataset size) for a total of 64 tests. It takes ~3-4 minutes to run on my system.
There are other parameters (distance type, number of dimensions, PQ vs. SQ, etc.), but these should mostly affect compute time and/or recall and are probably better covered by Rust benchmarks. Recall benchmarks will be done separately, as they require real data.
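For readers who want to picture the test matrix, here is a minimal sketch of how five parameters can expand into 64 cases. The PR description does not list the values, so every value below is hypothetical, chosen only so that the grid multiplies out to 64 (2 x 2 x 2 x 2 x 4). The names `PARAM_GRID`, `build_cases`, and `case_id` are also illustrative and not taken from the PR.

```python
import itertools

# Hypothetical values: the PR does not state them. They are picked so that
# the cartesian product has 2 * 2 * 2 * 2 * 4 = 64 combinations.
PARAM_GRID = {
    "use_cache": [False, True],
    "k": [1, 10],
    "nprobes": [1, 20],
    "refine_factor": [None, 2],
    "num_rows": [1_000, 10_000, 100_000, 1_000_000],
}


def build_cases(grid):
    """Expand a dict of parameter lists into one dict per combination."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in itertools.product(*grid.values())]


def case_id(case):
    """Build a readable, unique identifier for a single test case."""
    return "-".join(f"{name}={value}" for name, value in case.items())


CASES = build_cases(PARAM_GRID)
print(len(CASES))         # 64
print(case_id(CASES[0]))  # use_cache=False-k=1-nprobes=1-refine_factor=None-num_rows=1000
```

A list like `CASES` could be passed to `pytest.mark.parametrize("case", CASES, ids=case_id)` so that each combination shows up as its own named test.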
@bench-bot benchmark
Benchmark Results for PR #5476
Commit: e026964
Baseline: Up to 20 most recent historical results per benchmark
Summary
- Total benchmarks: 123
- Improvements: 0
- Regressions: 0
- Stable: 123
- Insufficient data: 0
All Benchmarks Within Normal Range
No benchmarks had |z-score| > 2.0
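The classification above can be sketched roughly as follows. This is not bench-bot's actual implementation. It assumes a sample standard deviation over the baseline runs, that a positive z-score (slower than baseline) counts as a regression, and a hypothetical minimum of two baseline samples. Only the 2.0 threshold comes from the report.

```python
import statistics

Z_THRESHOLD = 2.0  # from the report: |z-score| > 2.0 is out of the normal range


def z_score(pr_result, baseline):
    """Distance of the PR result from the baseline mean, in standard deviations."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)  # assumption: sample standard deviation
    if stdev == 0:
        return 0.0  # assumption: treat a zero-variance baseline as "no signal"
    return (pr_result - mean) / stdev


def classify(pr_result, baseline, min_samples=2):
    """Label one timing result; min_samples is a hypothetical cutoff."""
    if len(baseline) < min_samples:
        return "insufficient data"
    z = z_score(pr_result, baseline)
    if z > Z_THRESHOLD:
        return "regression"   # slower than the baseline by more than 2 sigma
    if z < -Z_THRESHOLD:
        return "improvement"  # faster than the baseline by more than 2 sigma
    return "stable"


baseline_ms = [100.0, 110.0, 90.0, 105.0, 95.0]  # made-up timings
print(classify(102.0, baseline_ms))  # stable
print(classify(130.0, baseline_ms))  # regression
print(classify(80.0, baseline_ms))   # improvement
```

With a noisy baseline, even a large drop in the mean can stay inside the threshold, which is consistent with rows in the table where the PR result is well below the baseline mean yet the z-score is only around -1.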
All Results
| Benchmark | PR Result | Baseline Mean | Baseline N | Z-Score | Status |
|---|---|---|---|---|---|
| Cosine(f32, auto-vectorized) | 102.97 ms | 153.39 ms | 20 | -1.19 | Stable |
| Cosine(f32, scalar) | 681.69 ms | 820.47 ms | 20 | -0.79 | Stable |
| Cosine(f64, auto-vectorized) | 290.17 ms | 404.36 ms | 20 | -1.02 | Stable |
| Cosine(f64, scalar) | 691.42 ms | 836.72 ms | 20 | -0.79 | Stable |
| Cosine(half::bfloat::bf16, auto-vectorized) | 268.18 ms | 362.38 ms | 20 | -0.86 | Stable |
| Cosine(half::bfloat::bf16, scalar) | 7.25 s | 9.15 s | 20 | -0.86 | Stable |
| Cosine(half::binary16::f16, auto-vectorized) | 127.35 ms | 154.34 ms | 20 | -0.80 | Stable |
| Cosine(half::binary16::f16, scalar) | 6.28 s | 7.79 s | 20 | -0.80 | Stable |
| Cosine(simd,f32x8) rng seed | 2.81 ms | 3.73 ms | 20 | -0.93 | Stable |
| Dot(bf16, auto-vectorization) | 409.46 ms | 546.63 ms | 20 | -0.83 | Stable |
| Dot(f16, SIMD) | 70.48 ms | 91.43 ms | 20 | -0.95 | Stable |
| Dot(f32, SIMD) | 101.22 ms | 144.37 ms | 20 | -1.14 | Stable |
| Dot(f32, arrow_artiy) | 637.93 ms | 792.48 ms | 20 | -0.91 | Stable |
| Dot(f32, auto-vectorization) | 96.77 ms | 150.78 ms | 20 | -1.45 | Stable |
| Dot(f64, arrow_artiy) | 637.73 ms | 797.02 ms | 20 | -0.90 | Stable |
| Dot(f64, auto-vectorization) | 188.91 ms | 287.17 ms | 20 | -1.22 | Stable |
| Dot(half::binary16::f16, arrow_artiy) | 728.39 ms | 870.45 ms | 20 | -0.90 | Stable |
| Dot(half::binary16::f16, auto-vectorization) | 70.01 ms | 92.24 ms | 20 | -1.09 | Stable |
| L2(f32, auto-vectorization) | 104.57 ms | 149.21 ms | 20 | -1.03 | Stable |
| L2(f32, scalar) | 659.71 ms | 804.28 ms | 20 | -0.84 | Stable |
| L2(f32, simd) | 104.29 ms | 146.38 ms | 20 | -1.00 | Stable |
| L2(f64, auto-vectorization) | 198.41 ms | 297.23 ms | 20 | -1.21 | Stable |
| L2(f64, scalar) | 656.48 ms | 815.34 ms | 20 | -0.89 | Stable |
| L2(half::binary16::f16, auto-vectorization) | 75.95 ms | 94.50 ms | 20 | -0.89 | Stable |
| L2(half::binary16::f16, scalar) | 2.82 s | 3.56 s | 20 | -0.86 | Stable |
| L2(simd,f32x8) | 4.60 ms | 5.62 ms | 20 | -0.84 | Stable |
| L2(uint8, auto-vectorization) | 54.67 ms | 66.65 ms | 20 | -0.95 | Stable |
| L2(uint8, scalar) | 50.17 ms | 61.07 ms | 20 | -1.10 | Stable |
| NormL2(f32, SIMD) | 97.66 ms | 142.88 ms | 20 | -1.10 | Stable |
| NormL2(f32, auto-vectorization) | 136.33 ms | 142.08 ms | 20 | -0.14 | Stable |
| NormL2(f32, scalar) | 659.87 ms | 787.33 ms | 20 | -0.73 | Stable |
| NormL2(f64, auto-vectorization) | 264.31 ms | 296.55 ms | 20 | -0.34 | Stable |
| NormL2(f64, scalar) | 670.35 ms | 807.86 ms | 20 | -0.75 | Stable |
| NormL2(half::bfloat::bf16, auto-vectorization) | 261.19 ms | 339.64 ms | 20 | -0.80 | Stable |
| NormL2(half::bfloat::bf16, scalar) | 2.79 s | 3.32 s | 20 | -0.77 | Stable |
| NormL2(half::binary16::f16, SIMD) | 71.74 ms | 84.97 ms | 20 | -0.57 | Stable |
| NormL2(half::binary16::f16, auto-vectorization) | 72.01 ms | 84.56 ms | 20 | -0.54 | Stable |
| NormL2(half::binary16::f16, scalar) | 679.61 ms | 810.05 ms | 20 | -0.73 | Stable |
| argmin(arrow) | 4.83 ms | 4.64 ms | 20 | +1.02 | Stable |
| decode_fsl/float32_128_v2.0_nullfalse | 32.17 ms | 48.35 ms | 20 | -0.83 | Stable |
| decode_fsl/float32_128_v2.1_nullfalse | 10.73 ms | 15.65 ms | 20 | -1.96 | Stable |
| decode_fsl/float32_128_v2.1_nulltrue | 22.12 ms | 32.22 ms | 20 | -1.61 | Stable |
| decode_fsl/float32_16_v2.0_nullfalse | 30.40 ms | 48.20 ms | 20 | -0.91 | Stable |
| decode_fsl/float32_16_v2.1_nullfalse | 17.12 ms | 24.29 ms | 20 | -1.22 | Stable |
| decode_fsl/float32_16_v2.1_nulltrue | 28.48 ms | 37.70 ms | 20 | -1.00 | Stable |
| decode_fsl/float32_32_v2.0_nullfalse | 30.45 ms | 48.07 ms | 20 | -0.90 | Stable |
| decode_fsl/float32_32_v2.1_nullfalse | 18.69 ms | 24.28 ms | 20 | -0.96 | Stable |
| decode_fsl/float32_32_v2.1_nulltrue | 24.01 ms | 34.03 ms | 20 | -1.24 | Stable |
| decode_fsl/float32_4_v2.0_nullfalse | 30.61 ms | 48.43 ms | 20 | -0.91 | Stable |
| decode_fsl/float32_4_v2.1_nullfalse | 17.46 ms | 25.27 ms | 20 | -1.43 | Stable |
| decode_fsl/float32_4_v2.1_nulltrue | 47.58 ms | 63.80 ms | 20 | -1.08 | Stable |
| decode_fsl/float32_64_v2.0_nullfalse | 30.62 ms | 48.08 ms | 20 | -0.90 | Stable |
| decode_fsl/float32_64_v2.1_nullfalse | 11.69 ms | 16.29 ms | 20 | -1.23 | Stable |
| decode_fsl/float32_64_v2.1_nulltrue | 25.29 ms | 33.65 ms | 20 | -1.32 | Stable |
| decode_fsl/int8_128_v2.0_nullfalse | 30.41 ms | 48.36 ms | 20 | -0.92 | Stable |
| decode_fsl/int8_128_v2.1_nullfalse | 17.43 ms | 25.29 ms | 20 | -1.49 | Stable |
| decode_fsl/int8_128_v2.1_nulltrue | 24.19 ms | 33.89 ms | 20 | -1.18 | Stable |
| decode_fsl/int8_16_v2.0_nullfalse | 30.25 ms | 48.20 ms | 20 | -0.92 | Stable |
| decode_fsl/int8_16_v2.1_nullfalse | 17.42 ms | 25.00 ms | 20 | -1.39 | Stable |
| decode_fsl/int8_16_v2.1_nulltrue | 47.81 ms | 63.93 ms | 20 | -1.13 | Stable |
| decode_fsl/int8_32_v2.0_nullfalse | 30.14 ms | 48.15 ms | 20 | -0.93 | Stable |
| decode_fsl/int8_32_v2.1_nullfalse | 17.01 ms | 24.92 ms | 20 | -1.46 | Stable |
| decode_fsl/int8_32_v2.1_nulltrue | 34.96 ms | 47.09 ms | 20 | -1.10 | Stable |
| decode_fsl/int8_4_v2.0_nullfalse | 30.06 ms | 48.17 ms | 20 | -0.94 | Stable |
| decode_fsl/int8_4_v2.1_nullfalse | 19.46 ms | 26.52 ms | 20 | -1.23 | Stable |
| decode_fsl/int8_4_v2.1_nulltrue | 131.40 ms | 174.89 ms | 20 | -1.09 | Stable |
| decode_fsl/int8_64_v2.0_nullfalse | 30.03 ms | 48.54 ms | 20 | -0.94 | Stable |
| decode_fsl/int8_64_v2.1_nullfalse | 17.60 ms | 24.32 ms | 20 | -1.17 | Stable |
| decode_fsl/int8_64_v2.1_nulltrue | 27.24 ms | 37.85 ms | 20 | -1.19 | Stable |
| decode_primitive/date32 | 11.51 ms | 15.89 ms | 20 | -1.55 | Stable |
| decode_primitive/date64 | 11.98 ms | 15.97 ms | 20 | -1.38 | Stable |
| decode_primitive/decimal128(10, 10) | 14.50 ms | 16.43 ms | 20 | -0.72 | Stable |
| decode_primitive/decimal256(10, 10) | 29.05 ms | 34.60 ms | 20 | -1.18 | Stable |
| decode_primitive/duration(second) | 11.60 ms | 16.97 ms | 20 | -1.29 | Stable |
| decode_primitive/fixed-utf8 | 65.79 µs | 95.95 µs | 20 | -1.20 | Stable |
| decode_primitive/float16 | 13.44 ms | 16.00 ms | 20 | -0.95 | Stable |
| decode_primitive/float32 | 13.69 ms | 16.34 ms | 20 | -0.89 | Stable |
| decode_primitive/float64 | 14.20 ms | 16.49 ms | 20 | -0.64 | Stable |
| decode_primitive/int16 | 11.91 ms | 15.82 ms | 20 | -1.33 | Stable |
| decode_primitive/int32 | 13.20 ms | 15.91 ms | 20 | -0.92 | Stable |
| decode_primitive/int64 | 12.02 ms | 16.24 ms | 20 | -1.37 | Stable |
| decode_primitive/int8 | 12.06 ms | 15.99 ms | 20 | -1.35 | Stable |
| decode_primitive/struct | 136.30 µs | 186.10 µs | 20 | -1.39 | Stable |
| decode_primitive/time32(second) | 11.99 ms | 16.39 ms | 20 | -1.68 | Stable |
| decode_primitive/time64(nanosecond) | 12.19 ms | 16.37 ms | 20 | -1.53 | Stable |
| decode_primitive/timestamp(nanosecond, none) | 12.27 ms | 16.25 ms | 20 | -1.33 | Stable |
| decode_primitive/uint16 | 13.33 ms | 16.22 ms | 20 | -0.76 | Stable |
| decode_primitive/uint32 | 12.86 ms | 16.31 ms | 20 | -1.13 | Stable |
| decode_primitive/uint64 | 13.78 ms | 16.30 ms | 20 | -0.71 | Stable |
| decode_primitive/uint8 | 12.64 ms | 15.65 ms | 20 | -1.00 | Stable |
| decode_primitive/utf8 | 752.31 µs | 954.59 µs | 20 | -0.97 | Stable |
| from_elem/full_read,parallel=1,read_size=1048576 | 36.39 ms | 165.77 ms | 20 | -0.88 | Stable |
| from_elem/full_read,parallel=1,read_size=16384 | 262.56 ms | 326.30 ms | 20 | -0.66 | Stable |
| from_elem/full_read,parallel=1,read_size=4096 | 938.26 ms | 1.15 s | 20 | -0.56 | Stable |
| from_elem/full_read,parallel=16,read_size=1048576 | 35.74 ms | 165.92 ms | 20 | -0.88 | Stable |
| from_elem/full_read,parallel=16,read_size=16384 | 261.63 ms | 318.14 ms | 20 | -0.64 | Stable |
| from_elem/full_read,parallel=16,read_size=4096 | 924.90 ms | 1.14 s | 20 | -0.59 | Stable |
| from_elem/full_read,parallel=32,read_size=1048576 | 35.81 ms | 165.97 ms | 20 | -0.88 | Stable |
| from_elem/full_read,parallel=32,read_size=16384 | 258.51 ms | 318.46 ms | 20 | -0.71 | Stable |
| from_elem/full_read,parallel=32,read_size=4096 | 929.44 ms | 1.14 s | 20 | -0.57 | Stable |
| from_elem/full_read,parallel=64,read_size=1048576 | 35.82 ms | 166.02 ms | 20 | -0.88 | Stable |
| from_elem/full_read,parallel=64,read_size=16384 | 258.62 ms | 317.58 ms | 20 | -0.69 | Stable |
| from_elem/full_read,parallel=64,read_size=4096 | 919.57 ms | 1.15 s | 20 | -0.63 | Stable |
| from_elem/random_read,parallel=1,item_size=1024 | 4.13 ms | 4.16 ms | 20 | -0.04 | Stable |
| from_elem/random_read,parallel=1,item_size=4096 | 4.66 ms | 6.95 ms | 20 | -0.83 | Stable |
| from_elem/random_read,parallel=1,item_size=8 | 1.81 ms | 1.85 ms | 20 | -0.12 | Stable |
| from_elem/random_read,parallel=16,item_size=1024 | 4.15 ms | 4.11 ms | 20 | +0.05 | Stable |
| from_elem/random_read,parallel=16,item_size=4096 | 4.64 ms | 6.94 ms | 20 | -0.82 | Stable |
| from_elem/random_read,parallel=16,item_size=8 | 1.82 ms | 1.84 ms | 20 | -0.05 | Stable |
| from_elem/random_read,parallel=32,item_size=1024 | 4.11 ms | 4.10 ms | 20 | +0.02 | Stable |
| from_elem/random_read,parallel=32,item_size=4096 | 4.91 ms | 6.93 ms | 20 | -0.72 | Stable |
| from_elem/random_read,parallel=32,item_size=8 | 1.83 ms | 1.85 ms | 20 | -0.05 | Stable |
| from_elem/random_read,parallel=64,item_size=1024 | 4.11 ms | 4.10 ms | 20 | +0.02 | Stable |
| from_elem/random_read,parallel=64,item_size=4096 | 4.73 ms | 6.91 ms | 20 | -0.77 | Stable |
| from_elem/random_read,parallel=64,item_size=8 | 1.83 ms | 1.84 ms | 20 | -0.02 | Stable |
| hamming,auto_vec | 74.65 ms | 96.16 ms | 20 | -0.90 | Stable |
| hamming,scalar | 118.43 ms | 160.09 ms | 20 | -0.87 | Stable |
| zip_1024Ki/2_2_2_zip_into_6 | 6.20 ms | 7.98 ms | 20 | -0.82 | Stable |
| zip_1024Ki/2_4_zip_into_6 | 4.61 ms | 5.65 ms | 20 | -0.85 | Stable |
| zip_32Ki/2_2_2_zip_into_6 | 247.84 µs | 254.99 µs | 20 | -0.10 | Stable |
| zip_32Ki/2_4_zip_into_6 | 154.40 µs | 178.00 µs | 20 | -0.65 | Stable |
| zip_8Ki/2_2_2_zip_into_6 | 42.65 µs | 67.75 µs | 20 | -1.42 | Stable |
| zip_8Ki/2_4_zip_into_6 | 34.56 µs | 45.52 µs | 20 | -1.08 | Stable |
Generated by bench-bot