qp-trie-rs
qp-trie-rs copied to clipboard
Types design
Why should the Trie itself be parameterized over key type? Currently both having K as type parameter and needing it passed as owned to insert and (especially) to entry pretty much forces us to make copy of key for each invocation. I think we only need key when first inserting into the trie, which could be better achieved with ToOwned (btw this comment makes sense for me: https://github.com/sdleffler/qp-trie-rs/blob/91507d5cfee53bafdeeb69deb36140f496a3bbaa/src/node.rs#L245). This way, insert and entry would require Borrow + ToOwned and would clone only when really necessary.
Then if we stored [u8] in some form in node and parameterize functions like insert and entry themselves which would allow passing them &'a [u8] with different 'a which is currently impossible.
Storing [u8] in a node can only be done by some sort of heap indirection (minus small_vec and similar optimizations.) Using ToOwned/Borrow escapes the need for any such thing when using small or fixed-size keys, for example small byte arrays ([u8; 4].)
In addition, Trie is (despite having a byte slice interface) effectively a key-value map. I don't like the idea of allowing users to insert keys of multiple different types into the same Trie; that seems like a good recipe for a type error, and something that should be solved by making an enum of different key types. Do you have a use case for this lack of parametricity over K?
But isn't it the case that the keys are usually non-fixed-length?
My use case involves storing (small number of) different strings and finding/inserting (if new) them frequently (like 1 mil rps). I really don't want to clone key on every request, and I cannot pass string ownership to entry() either. The fact that Trie (persistent) is parameterized over K and lifetime of key must be at least that of Trie I cannot specify K to reference type.
Now imagine if Trie wasn't parameterized (I'm new to rust so not sure on exact signature)
struct Trie(Vec<u8>);
impl Trie {
fn insert<K: ?Sized>(&mut self, key: &K)
where
Vec<u8>: Borrow<K>,
K: ToOwned<Owned = Vec<u8>>,
{
self.0 = key.to_owned();
}
}
This would allow me pass &[u8] as key and it would only be copied when necessary.
It does not ring true for me that cloning the string is necessary. It's been a while since I looked at this code but Borrow<K> should allow you to pass in a reference instead and have it not be cloned unless it is not in the trie. Have you looked through the source? I'll need to dig through to check this.
I could pass reference (instantiate K to reference type), but I would have to fix lifetime of references which I can pass to insert/entry for each Trie instance, which is exactly why I have a problem with parameterizing the Trie data type.
Ok so maybe example:
let mut trie: Trie<&[u8], u16> = Trie::new();
{
let keyFromSomewhere = [a, b, c];
trie.entry(&keyFromSomewhere[..]);
}
{
let keyFromSomewhereElse = [d, e, f];
trie.entry(&keyFromSomewhereElse[..]);
}
This currently won't work because lifetimes of two references will not match. It would not be a problem if the function entry was parameterized instead of whole Trie.
Mm, I see your problem.
This does not need to be solved by removing the K parameter (and I am immediately against that solution, I think it destroys part of the type safety which is preserved by its presence.) It should be possible to add a new type variable in insert, call L: ToOwned<Owned = K>, for which you can pass in either K or &K (or other types implementing ToOwned<Owned = K>. Calling to_owned on K would be a noop and simply pass the value through, while calling to_owned on &K would end in a clone.
I don't remember if there's a reason I didn't implement it this way. It may have been an oversight, or caused other issues with types. Will try to take a look tonight and see if I can figure it out.
If these keys will never be removed, and it is acceptable to keep their memory for the lifetime of your binary, I believe there are libraries which can be used to intentionally leak memory, getting you an &'static [u8] which will not have the lifetime issue. Could be a usable temporary solution.
Thanks for having a look at the issue. However considering I want to avoid copying the keys because of high intensity of requests, I dont think leaking the keys memory forever is a solution for me =)