raft icon indicating copy to clipboard operation
raft copied to clipboard

[FEA] Batched k-selection

Open cjnolet opened this issue 1 year ago • 2 comments

This issue is to track discussion and implementation of an efficient mechanism for supporting batched k-selection. The current discussion has been around allowing "restart" in batches (potentially by storing off the max distance found thus far and implementing an ANN pre-filter to exclude all neighbors below that distance). Ultimately, we would like to support batching the computation of the k's so a user could say "give me the closest 1-5 neighbors, then the closest 6-10 neighbors, then the closest 11-15 neighbors, etc...) and not have to load the entire set in memory at the same time.

cjnolet avatar Jun 01 '23 16:06 cjnolet

I talked to @spartee. He said they call this query aggregation and they have recently added it on the CPU side. @spartee can add more detail.

nwstephens avatar Oct 05 '23 18:10 nwstephens

On the raft's side we often use the word batch for the groups of queries processed by a single select_k. How about changing the word batched here with paged to depict that we page through a single list of k_big neighbors, k_small neighbors at a time?

achirkin avatar Jan 24 '24 05:01 achirkin