ustore
Smarter bulk & unordered scans
Currently, ukv_scan only supports fully consistent, sorted exports of keys from collections.
With the bulk flag we allow prioritizing throughput over consistency, but one could argue that ML-like pipelines don't need any dependencies between operations whatsoever. Instead, they may use scans to sample entries uniformly at random, which in turn requires a full scan of the keys. If the user leaves start_key unset, we can perform the bulk sampling behind the scenes ourselves.
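To illustrate why uniform sampling implies a full pass over the keys, here is a minimal sketch of reservoir sampling (Algorithm R) in Python. It is not the ustore implementation: the `reservoir_sample` helper is hypothetical, and a plain `range` stands in for the key stream a full, unordered scan would produce.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(keys: Iterable[int], k: int, seed: Optional[int] = None) -> List[int]:
    """Uniformly sample up to k keys in a single pass over `keys` (Algorithm R)."""
    rng = random.Random(seed)
    reservoir: List[int] = []
    for i, key in enumerate(keys):
        if i < k:
            # Fill the reservoir with the first k keys.
            reservoir.append(key)
        else:
            # Keep the new key with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = key
    return reservoir


# Assumption: a stand-in for the keys streamed by a full collection scan.
all_keys = range(1_000)
sample = reservoir_sample(all_keys, k=10, seed=42)
```

Every key has to be visited once, but the order in which keys arrive does not affect uniformity, which is why such a workload would not need the sorted, consistent export.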
This makes the interface uglier, since one function becomes dual-use, but it keeps the interface short. Worth considering.
These changes should precede #17, so that the scan interface is finalized first.
If the bulk flag is provided, we can treat the passed keys not as start keys, but as the last keys of the previous batch.
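The difference between the two interpretations can be sketched with a toy in-memory scan. Everything here is hypothetical: the `scan` function and its `bulk` parameter are illustrative and are not the ukv_scan signature, the sketch assumes a start key is inclusive, and a sorted list is used only to keep the example deterministic.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional


def scan(sorted_keys: List[int], key: Optional[int], limit: int, bulk: bool = False) -> List[int]:
    """Return up to `limit` keys, interpreting `key` according to the `bulk` flag."""
    if key is None:
        begin = 0
    elif bulk:
        # `key` is the last key of the previous batch: resume strictly after it.
        begin = bisect_right(sorted_keys, key)
    else:
        # `key` is a start key: assumed inclusive in this sketch.
        begin = bisect_left(sorted_keys, key)
    return sorted_keys[begin:begin + limit]


keys = [10, 20, 30, 40, 50]

# Bulk-style pagination: feed the last key of each batch into the next call.
batches = []
last = None
while True:
    batch = scan(keys, last, limit=2, bulk=True)
    if not batch:
        break
    batches.append(batch)
    last = batch[-1]
```

With the "last key" interpretation the caller can pass `batch[-1]` back unchanged, without computing a successor key and without receiving the boundary key twice.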