sled icon indicating copy to clipboard operation
sled copied to clipboard

Some more `*_prefix` operations

Open emfax opened this issue 4 years ago • 1 comments

Use Case:

Remove all entries with a given prefix. If you're storing a large files in chunks, where each chunk's key has some ordered suffix, removing all the entries would be easy if there was a remove_prefix method.

First and last entries with a given prefix. Again, if you're storing a large files in chunks, it would be nice to be able to use first_prefix or last_prefix where these methods return the first or last entries within a given prefix respectively.

Proposed Change:

Add methods remove_prefix, first_prefix and last_prefix to the Tree implementation.

Who Benefits From The Change(s)?

Users who want to store blobs of data in separate entries, but under some circumstances want to treat it as one unit.

Alternative Approaches

Create a separate Tree for such cases. drop_tree, first and last could be used. It seems like this might be overkill? If a user wanted to store thousands of files, would you need a Tree for every file? Isn't that a lot of overhead?

emfax avatar Nov 16 '20 22:11 emfax

I was going to comment on this pointing out another use case for first_prefix and last_prefix, but after some thought, I've concluded I don't think these are necessary.

Simply put, I am not sure if any utility method could be any faster than writing it one's self:

  • first_prefix can be written as scan_prefix(...).next()
  • last_prefix can be written as scan_prefix(...).last()
  • The bulk removal case mentioned @ttxndrx is simply sled.scan_prefix(...).keys().for_each(|k| sled.remove(k)) in a transaction.

A quick look at the internals suggests that these are well-optimized. Key iterators are double-ended, making both next and last fast. My limited understanding of Sled's lock-free structures suggests that deleting 100 keys safely inherently requires 100 copies, 100 log allocations, 100 partially-contended writes, etc., and ordering of these operations within the lock safety rules doesn't have a significant impact.

All that said, I am presuming that key iterators return matching keys in lexicographic order. Is this a guarantee of the API, @spacejam? Internals suggest it probably is, but I couldn't find it explicitly stated in the docs anywhere.

jhwgh1968 avatar Dec 28 '20 15:12 jhwgh1968