sled
sled copied to clipboard
Some more `*_prefix` operations
Use Case:
Remove all entries with a given prefix. If you're storing a large files in chunks, where each chunk's key has some ordered suffix, removing all the entries would be easy if there was a remove_prefix
method.
First and last entries with a given prefix. Again, if you're storing a large files in chunks, it would be nice to be able to use first_prefix
or last_prefix
where these methods return the first or last entries within a given prefix respectively.
Proposed Change:
Add methods remove_prefix
, first_prefix
and last_prefix
to the Tree
implementation.
Who Benefits From The Change(s)?
Users who want to store blobs of data in separate entries, but under some circumstances want to treat it as one unit.
Alternative Approaches
Create a separate Tree
for such cases. drop_tree
, first
and last
could be used. It seems like this might be overkill? If a user wanted to store thousands of files, would you need a Tree
for every file? Isn't that a lot of overhead?
I was going to comment on this pointing out another use case for first_prefix
and last_prefix
, but after some thought, I've concluded I don't think these are necessary.
Simply put, I am not sure if any utility method could be any faster than writing it one's self:
-
first_prefix
can be written asscan_prefix(...).next()
-
last_prefix
can be written asscan_prefix(...).last()
- The bulk removal case mentioned @ttxndrx is simply
sled.scan_prefix(...).keys().for_each(|k| sled.remove(k))
in a transaction.
A quick look at the internals suggests that these are well-optimized. Key iterators are double-ended, making both next
and last
fast. My limited understanding of Sled's lock-free structures suggests that deleting 100 keys safely inherently requires 100 copies, 100 log allocations, 100 partially-contended writes, etc., and ordering of these operations within the lock safety rules doesn't have a significant impact.
All that said, I am presuming that key iterators return matching keys in lexicographic order. Is this a guarantee of the API, @spacejam? Internals suggest it probably is, but I couldn't find it explicitly stated in the docs anywhere.