schnellru Added LruMap::try_insert for non-destructive faillible inserts

Jul 10 '23 08:07 OtaK

Thanks for the PR!

Unfortunately it isn't entirely correct as it doesn't actually guarantee the non-destructiveness property. If there already exists a key in the map but the limiter rejects the new item then the old one will be removed, directly contradicting the "will not try removing elements to insert this one" line in the doc comment. And if there's not enough physical space in the map and growing it fails it will also still try to remove items from it.

There's also the issue that you've disabled the post-insert popping of items, which means that it can now violate the constraints imposed by the limiter.

Also, I think a better name would probably be insert_without_removing or something like that, since the original insert is already essentially a fallible operation.

Jul 19 '23 09:07 koute

Hi @koute,

I'd like to take over this to get it to a state where it can be merged (we at Wire have a fork of schnellru which we would like to get rid of and go back to follow upstream if possible).

To that end, I have a question:

If there already exists a key in the map but the limiter rejects the new item then the old one will be removed, directly contradicting the "will not try removing elements to insert this one" line in the doc comment.

This is caused by

if !can_replace {                                                                 
    // Even if we remove the old item we still can't afford to add the new one;   
    // just remove the old item and bail out.                                     
                                                                                  
    // SAFETY: Bucket exists since we just found it.                              
    unsafe { self.remove_bucket_noinline(bucket) };                               
    return false;

Could you please explain the reasoning behind this, i.e. why do we remove the old item if we can't replace it with the new one?

Jul 15 '24 11:07 istankovic

Could you please explain the reasoning behind this, i.e. why do we remove the old item if we can't replace it with the new one?

Imagine this is used as a cache (which is one of the most common uses of an LRU map). You cache A = B. Now A changes to A = C and you try to insert it into the map again. But this doesn't fit anymore, so what do you do? You either leave A = B (which makes your cache incorrect) or remove A altogether from the cache (which means you'll have to go through a cache miss on the next access, but your code stays correct).

Jul 15 '24 13:07 koute

Imagine this is used as a cache (which is one of the most common uses of an LRU map). You cache A = B. Now A changes to A = C and you try to insert it into the map again. But this doesn't fit anymore, so what do you do? You either leave A = B (which makes your cache incorrect) or remove A altogether from the cache (which means you'll have to go through a cache miss on the next access, but your code stays correct).

Hm, yes, I guess it makes sense, looking from that perspective.

However, I'm not sure then how to proceed here, since

we can't change the current semantics of insert, and
we can't allow modification of the existing map if the insert_without_removing variant is called

We could consider removing the existing bucket in the above case only if make_room is false, but... that feels hacky somehow. I'm curious to hear your suggestions.

Jul 15 '24 15:07 istankovic

What are the exact semantics you're looking for here?

Are you looking for an ordered map that will always replace an existing item on insert and will never implicitly remove other items? In that case you don't need to add an extra method - the map already works like this if you use an appropriate limiter.

Jul 15 '24 15:07 koute

Are you looking for an ordered map that will always replace an existing item on insert and will never implicitly remove other items? In that case you don't need to add an extra method - the map already works like this if you use an appropriate limiter.

I think that's the case, yes, and in fact, the existing insert already behaves like that due to the limiter we use, which is a combination of ByMemUsage and ByLength and it just returns true in our on_replace function. So AFAICT, we'll never remove an old element on insert. If this is the case, then we can just drop both this PR and our fork.

Jul 16 '24 08:07 istankovic

Thanks @koute for the help here, we just dropped our fork and don't need this anymore.

Jul 16 '24 10:07 istankovic