
implement deletion vector handling in index scan

Open chebbyChefNEQ opened this issue 1 year ago • 3 comments

closes #916

Implement deletion vector handling in PQIndex, where all vector indexes are stored.

TODO:

  • [ ] add tests
  • [ ] add TODO comments in places we could have potential performance gains
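To illustrate the idea behind deletion vector handling in an index scan, here is a minimal sketch in plain Python. It is not the code in this PR: `top_k_live`, the `(row_id, distance)` candidate format, and the use of a `set` as the deletion vector are all assumptions made for illustration. The point is only that candidates produced by the index are checked against the deleted row ids before the top-k results are selected.

```python
import heapq
from typing import Iterable, List, Set, Tuple


def top_k_live(
    candidates: Iterable[Tuple[int, float]],
    deleted: Set[int],
    k: int,
) -> List[Tuple[int, float]]:
    """Return the k nearest (row_id, distance) pairs whose row is not deleted.

    `candidates` stands in for what an index scan would produce and
    `deleted` stands in for a deletion vector; both are hypothetical.
    """
    # Drop every candidate whose row id appears in the deletion vector.
    live = ((row_id, dist) for row_id, dist in candidates if row_id not in deleted)
    # Keep the k candidates with the smallest distance, nearest first.
    return heapq.nsmallest(k, live, key=lambda pair: pair[1])


# Row 1 is the nearest candidate but has been deleted, so it must not be returned.
candidates = [(0, 0.10), (1, 0.05), (2, 0.30), (3, 0.20)]
deleted = {1}
result = top_k_live(candidates, deleted, k=2)
```

Filtering before the top-k selection matters: filtering afterwards could return fewer than k rows even when enough live rows exist.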

chebbyChefNEQ · Jun 09 '23 01:06

TL;DR:

  • Seems to have very little latency impact
  • We should also test query throughput, not just latency.

After picking in #960, the difference in index scan performance before and after this PR is negligible. Any perf impact introduced is smaller in magnitude than the noise.

The impact may become more pronounced in an environment where IO latency is high, say, an EC2 instance with EBS, which is a remote disk.

Since this change introduces more IO, I'd hypothesize that there could be a hit to query throughput.
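A rough sketch of how throughput could be measured alongside latency is below. Nothing here comes from the existing benchmark: `measure`, `fake_query`, and the thread-pool setup are hypothetical, and `fake_query` would need to be replaced by a real ANN query against the dataset.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, Dict


def measure(run_query: Callable[[], object], n_queries: int, concurrency: int) -> Dict[str, float]:
    """Run n_queries with the given concurrency; report mean latency and queries/sec."""

    def timed() -> float:
        # Per-query latency, measured inside the worker thread.
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed) for _ in range(n_queries)]
        latencies = [f.result() for f in futures]
    wall = time.perf_counter() - wall_start

    return {
        "queries": n_queries,
        "mean_latency_s": mean(latencies),
        # Throughput uses total wall-clock time, so it reflects contention on IO.
        "qps": n_queries / wall,
    }


def fake_query() -> None:
    # Placeholder for a real ANN query; sleeps to simulate IO wait.
    time.sleep(0.001)


stats = measure(fake_query, n_queries=20, concurrency=4)
```

Comparing `qps` at several concurrency levels before and after the change would show whether the extra IO limits throughput even when single-query latency looks unchanged.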

chebbyChefNEQ · Jun 11 '23 21:06

Seems to have very little latency impact

is this for scanning or the ANN search?

eddyxu · Jun 11 '23 22:06

is this for scanning or the ANN search?

ANN search. The benchmark uses nearest.

chebbyChefNEQ · Jun 11 '23 23:06