quick-cache
`before_evict` + `weight` API is stateful/awkward (it can't effectively do pinning)
Hey there,
I'm currently building a system where some data is "strongly" held in the cache (and thus never evicted) until it is synced to a persistent store, after which it is "weakly" held and thus eligible for eviction.
The ability to set the weight of an entry to 0 before eviction, and thus prevent it from being evicted, seems ideal for this. However, I found the API to be a bit awkward, as it requires statefully remembering that an entry is about to be evicted and then adjusting the weight accordingly.
E.g. in my case the weight of each entry is `value.len()` normally, and `if Arc::strong_count(value) == 1 { value.len() } else { 0 }` on eviction.
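For context, this is roughly the setup I have in mind (illustrative sketch only; the `SyncQueue` type and its methods are made up for this example):

```rust
use std::collections::VecDeque;
use std::sync::Arc;

// Illustrative only: values are shared via Arc. A write-back queue keeps an
// extra clone of every entry that hasn't been persisted yet, so for unsynced
// entries Arc::strong_count(&value) > 1, and once the sync worker drops its
// clone the cache holds the only reference (strong_count == 1).
struct SyncQueue {
    pending: VecDeque<(String, Arc<Vec<u8>>)>,
}

impl SyncQueue {
    fn enqueue(&mut self, key: String, value: Arc<Vec<u8>>) {
        // Holding this clone is what should keep the cache entry pinned.
        self.pending.push_back((key, value));
    }

    fn mark_front_synced(&mut self) {
        // Dropping the queued clone is what makes the entry evictable again.
        self.pending.pop_front();
    }
}
```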
After looking through the code, it looks like the pre-eviction weight is only ever used ephemerally, to determine whether it is == 0 or below the remaining weight threshold. So the semantics seem to be: a last chance to adjust your weight to show that you really, really want to stay in the cache, in which case one will probably always choose 0.
So it seems to me that this semantic might be more cleanly captured by always having `weight` return the "true" weight of the entry, but giving `before_evict` the ability to prevent eviction by returning a boolean.
Thoughts? 😄
I'm not sure what you mean by stateful in this case. You mention you have to remember that an item is about to be evicted, but I don't follow why, or how that would change with your proposal. Maybe you can clarify further?
The lifecycle hooks design is very experimental at this point. You can still do some useful stuff with it, as you found out yourself.
Initially I planned to add hooks for cold/hot changes which could also change weights. This design would make more sense with those.
> So it seems to me that this semantic might be more cleanly captured by always having `weight` return the "true" weight of the entry, but giving `before_evict` the ability to prevent eviction by returning a boolean.
I think you are suggesting/looking for "cache pinning". A pinned item cannot be evicted from the cache, but still occupies space.
My apologies if I haven't been clear. Yeah, cache pinning is essentially the feature I'm looking for 😄
My proposal was along the lines of the following changes:
https://github.com/arthurprs/quick-cache/blob/d1c7d96811eb4dd318857a6bd943c7f027469380/src/lib.rs#L120
```rust
pub trait Lifecycle<Key, Val> {
    //...
    fn before_evict(&self, state: &mut Self::RequestState, key: &Key, val: &mut Val) -> bool { true }
    //...
}
```
https://github.com/arthurprs/quick-cache/blob/d1c7d96811eb4dd318857a6bd943c7f027469380/src/shard.rs#L514-L519
```rust
if !self.lifecycle.before_evict(lcs, &resident.key, &mut resident.value)
    || self.weighter.weight(&resident.key, &resident.value) == 0
{
    self.cold_head = Some(next);
    return;
}
```
https://github.com/arthurprs/quick-cache/blob/d1c7d96811eb4dd318857a6bd943c7f027469380/src/shard.rs#L786-L791
```rust
// don't admit if it won't fit within the budget
if weight > self.weight_capacity && self.lifecycle.before_evict(lcs, &key, &mut value) {
    // Make sure to remove any existing entry
```
So to my understanding, you would currently write something like this (tbh after writing this I'm even less sure how to use this API correctly 😅):
```rust
fn before_evict(&self, state: &mut Self::RequestState, key: &Key, val: &mut Val) {
    val.weight = if Arc::strong_count(val) == 1 { val.len() } else { 0 };
}
//...
fn weight(&self, key: &Key, val: &Val) -> u32 {
    val.weight
}
```
whereas you would then write:
```rust
fn before_evict(&self, state: &mut Self::RequestState, key: &Key, val: &mut Val) -> bool {
    Arc::strong_count(val) == 1
}
//...
fn weight(&self, key: &Key, val: &Val) -> u32 {
    val.len()
}
```
Omitting the additional state in Val.
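To make that concrete, the value types could look roughly like this (purely illustrative, the names are made up; in the snippets above I glossed over the distinction and treated `val` as if it were the Arc directly):

```rust
use std::sync::Arc;

// Current API: the value has to carry a weight field so that weight() can
// report whatever before_evict decided, i.e. extra state that exists only
// to communicate the eviction decision.
struct StatefulVal {
    data: Arc<Vec<u8>>,
    weight: u32,
}

// Proposed API: the value can just be the shared data itself, because the
// "don't evict me" decision is the return value of before_evict.
type PlainVal = Arc<Vec<u8>>;
```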
I'm not sure what you mean by cold/hot changes. Do you mean cases where you have a cache hierarchy, e.g. RAM, disk, stuff reachable over the network, and the coldness/hotness of the next cache levels changes?
But like I said, I might be misunderstanding the whole goal and API, and the weight isn't ephemeral after `before_evict` but implicitly 0 by how the clock hands advance or something. I'm currently reading into the CLOCK-Pro paper, so maybe I'll be a bit wiser afterwards 😄
Yeah, your example really cements that it's pinning. Like:
```rust
trait Lifecycle<Key, Val> {
    fn is_pinned(&self, state: &Self::RequestState, key: &Key, value: &Val) -> bool {
        Arc::strong_count(value) != 1
    }
}
```
The only drawback is that pinning complicates the internal cache logic, as it's no longer guaranteed that the cache can move things around (resulting in infinite loops, etc.). It's fixable, but will require some work.
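To illustrate the concern with a contrived sketch (this is not the actual quick-cache eviction code): a naive loop that just skips pinned entries stops making progress once everything left is pinned.

```rust
use std::collections::VecDeque;

// Contrived example: (pinned, weight) pairs. If every remaining entry is
// pinned, this loop just rotates the queue forever, so the real eviction
// logic needs an explicit "no progress possible" stop condition.
fn evict_until_fits(queue: &mut VecDeque<(bool, u64)>, used: &mut u64, budget: u64) {
    while *used > budget {
        let Some((pinned, weight)) = queue.pop_front() else { break };
        if pinned {
            queue.push_back((pinned, weight));
        } else {
            *used -= weight;
        }
    }
}
```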
> I'm not sure what you mean by cold/hot changes. Do you mean cases where you have a cache hierarchy, e.g. RAM, disk, stuff reachable over the network, and the coldness/hotness of the next cache levels changes?
Inside the cache, items have different "hotness". So theoretically an item could be compacted (or moved to disk) when it's cold, or something like that. In practice, it's quite hard to achieve this in a generic component like QuickCache.
> Inside the cache, items have different "hotness". So theoretically an item could be compacted (or moved to disk) when it's cold, or something like that. In practice, it's quite hard to achieve this in a generic component like QuickCache.
Gotcha!
> The only drawback is that pinning complicates the internal cache logic, as it's no longer guaranteed that the cache can move things around (resulting in infinite loops, etc.). It's fixable, but will require some work.
Yeah, I've also been wondering what happens if there are only 0-weight elements.
I'm gonna read into the paper and the code, and try to contribute a pinning patch, if you're interested? 😄
> I'm gonna read into the paper and the code, and try to contribute a pinning patch, if you're interested? 😄
100%
> Yeah, I've also been wondering what happens if there are only 0-weight elements.
There's also the edge case where the cache can't free up any space because things are pinned 😢