Andrew Kane comments

Results: 378 comments of Andrew Kane

Iterative index scans

@hlinnaka I think the main benefits to batching are: 1. Minimizing out of order elements (elements can only be out of order between batches) 2. Less code duplication 3. Possibly...
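To illustrate the first point, here is a minimal sketch (not pgvector's actual code) of why batching confines out-of-order results to batch boundaries. It assumes each batch is a list of `(distance, id)` tuples and is sorted before being emitted; `emit_in_batches` and `count_out_of_order` are hypothetical names used only for this example.

```python
def emit_in_batches(batches):
    """Yield (distance, id) tuples, sorting each batch by distance first.

    Within a batch the output is ordered; only a tuple from a later
    batch can be closer than something already emitted.
    """
    for batch in batches:
        yield from sorted(batch)


def count_out_of_order(results):
    """Count tuples whose distance is smaller than an earlier tuple's distance."""
    count = 0
    max_seen = float("-inf")
    for dist, _ in results:
        if dist < max_seen:
            count += 1
        else:
            max_seen = dist
    return count


# Hypothetical data: two batches of (distance, id) pairs.
batches = [[(0.3, "a"), (0.1, "b")], [(0.2, "c"), (0.4, "d")]]
results = list(emit_in_batches(batches))
out_of_order = count_out_of_order(results)
```

In this toy input, only `c` (from the second batch) arrives after a farther tuple, so the disorder is limited to the boundary between batches.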

Iterative index scans

Hi @alanwli, 1. Correct. We could filter out tuples that are closer than the last one returned to avoid this, but this could affect recall. 2. The scan will end...
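The filtering idea in point 1 can be sketched as follows. This is an assumption-laden illustration, not the pgvector implementation: results are modeled as `(distance, id)` tuples, and `strict_order_filter` is a hypothetical helper name.

```python
def strict_order_filter(results):
    """Drop any tuple that is closer than the last tuple returned.

    The output is non-decreasing in distance, at the cost of discarding
    some genuinely close tuples (which is why recall can drop).
    """
    last = float("-inf")
    for dist, item in results:
        if dist < last:
            continue  # closer than the last one returned: discard
        last = dist
        yield dist, item


# Hypothetical scan output where "c" arrives late (out of order).
scan = [(0.1, "b"), (0.3, "a"), (0.2, "c"), (0.4, "d")]
filtered = list(strict_order_filter(scan))
```

Here `c` is discarded even though it is closer than `a` and `d`, which shows the recall trade-off mentioned above.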

Iterative index scans

Looking at SIFT 1M, a decent % of tuples can be discarded if strict ordering is implemented (with a filter selectivity of 0.01, usually 2-7%, but sometimes more than 10%).

Iterative index scans

The additional incorrect semantic is that results may not be strictly ordered (which won't happen with GiST). I'd like to minimize these, as it's not intuitive for users, but think the...

Iterative index scans

This is true from the user's perspective :) It'd be great if we could do something similar, but the recheck code in Postgres requires strict ordering. Edit: Going to try...

Iterative index scans

Some data on how batch size and selectivity affect out of order results with SIFT 1M (default parameters, limit 20). batch_size | sel 0.1 | sel 0.01 --- | ---...

Some benchmarking results from the [`arxiv` dataset](https://github.com/qdrant/ann-filtering-benchmark-datasets) (2.1M rows, 384 dimensions, first 100 queries, default build parameters, 6GB shared buffers) Branch | Time | Recall --- | --- | ---...

Iterative index scans

Great, just ran the benchmark above with strict ordering enabled ([branch](https://github.com/pgvector/pgvector/compare/hnsw-streaming...hnsw-streaming-strict)). It drops the recall from 95.8 to 91.8. Edit: More data on strict ordering and ef_search (batch size) ef_search...

Iterative index scans

One idea is to allow users to select the mode:

```tsql
SET hnsw.iterative_search = off; -- default for 0.8.0
SET hnsw.iterative_search = on; -- strict, default for future version
SET...
```

Update Cargo.toml

Hi @Dylan-DPC, thanks for the PR. From what I can tell (testing w/ `rust_decimal`, which incorporated this in `1.37.0`): - If an earlier version of Diesel is installed, this will...