ustore

Smarter bulk & unordered scans

Open ashvardanian opened this issue 3 years ago • 2 comments

Currently ukv_scan only works for fully consistent, sorted exports of keys from collections. With the bulk flag we allow prioritizing throughput over consistency, but a point can be made that ML-like pipelines don't need any dependency between operations whatsoever. Instead, they may use scans to sample entries uniformly at random, which would in turn require a full scan of the keys. If the user leaves start_key unset, we can perform the bulk sampling behind the scenes ourselves. Making one function dual-use makes the interface uglier, but it keeps the interface short. Worth considering.
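To make the sampling idea concrete, here is a minimal sketch of uniform sampling over a single full pass of keys, using reservoir sampling (Algorithm R). It is not the ukv implementation: `sample_keys` and its parameters are hypothetical names, and the key stream is modeled as a plain Python iterable rather than a real scan.

```python
import random
from typing import Iterable, List, Optional


def sample_keys(
    keys: Iterable[int],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return up to k keys chosen uniformly at random from one pass over `keys`.

    Hypothetical helper: `keys` stands in for whatever a full scan yields.
    """
    rng = rng or random.Random()
    reservoir: List[int] = []
    for seen, key in enumerate(keys, start=1):
        if len(reservoir) < k:
            # Fill the reservoir with the first k keys.
            reservoir.append(key)
        else:
            # Keep the new key with probability k / seen,
            # evicting a uniformly chosen resident.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = key
    return reservoir


if __name__ == "__main__":
    print(sample_keys(range(1_000_000), 5, random.Random(42)))
```

The pass touches every key once and needs only O(k) memory, which is why the order in which keys arrive does not matter for the sample.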

ashvardanian • Aug 28 '22 22:08

Those changes should precede #17, so that the scan interface is finalized first.

ashvardanian • Aug 28 '22 22:08

If the bulk flag is provided, we can treat the passed keys not as start keys, but as the last keys of the previous batch.
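A rough sketch of the difference between the two interpretations, assuming an in-memory sorted list in place of a real collection. The `scan` and `scan_all` functions and the `bulk` parameter here are illustrative stand-ins, not the actual ukv_scan signature; a real bulk scan also would not have to return keys in sorted order.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional


def scan(
    sorted_keys: List[int],
    key: Optional[int],
    limit: int,
    bulk: bool = False,
) -> List[int]:
    """Hypothetical scan over an in-memory sorted key list.

    Without `bulk`, `key` is an inclusive start key.
    With `bulk`, `key` is the last key of the previous batch (exclusive).
    """
    if key is None:
        begin = 0
    elif bulk:
        begin = bisect_right(sorted_keys, key)  # resume strictly after `key`
    else:
        begin = bisect_left(sorted_keys, key)  # start at `key` itself
    return sorted_keys[begin:begin + limit]


def scan_all(sorted_keys: List[int], batch_size: int) -> List[int]:
    """Drain the collection by feeding each batch's last key back in."""
    exported: List[int] = []
    last: Optional[int] = None
    while True:
        batch = scan(sorted_keys, last, batch_size, bulk=True)
        if not batch:
            break
        exported.extend(batch)
        last = batch[-1]
    return exported
```

With the exclusive interpretation, the caller can pass the last key it received straight back in, without computing a "next" key or dropping a duplicated first entry.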

ashvardanian • Oct 18 '22 12:10