Cache Redesign
This is a tracking issue for feature requests, ideas, and design proposals for the cache system redesign I'll be looking at and working on for 0.6.0. The first task is probably replacing the regular RwLock-wrapped HashMaps with DHashMap
from the ccl
crate. Design proposals should also go here.
The implementation for the suggested alternative to HashMap
can be found at https://gitlab.nebulanet.cc/xacrimon/ccl/blob/master/src/dhashmap.rs
How about we make the caching calls more explicit? That would mean passing the cache implementation as an argument to every call/constructor explicitly. I'm not a fan of magic...
To properly understand the problem, though, maybe we could outline what we want to cache (and what we're already caching right now)?
That has been done in serenity 0.6.x. Well, you pass a Context, which holds a couple of other things as well, but it implements both AsRef<Http>
and AsRef<Cache>
.
Also, regarding what we want to cache: we want to cache pretty much everything that we can. We can cache messages, but that is disabled by default.
I'm going to draw up a Cache trait when I get home. Hold tight till then.
Alright. So we want to abstract the cache API away from the actual methods used to store the data. I think the best way to do this is to define a CacheBackend
trait that we implement for different backends. We then take the trait object and wrap it like this:
pub struct Cache {
    backend: Box<dyn CacheBackend>,
}
Finally we implement wrapper methods that call the backend.
This creates a clean solution and makes it easy to swap the cache if you so desire. It means we can first refactor the cache system with a well-tested map like hashbrown's, and then look into a better backend like DHashMap
.
Does this sound alright?
The new Cache struct would sit in the same place as the current cache does.
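A minimal sketch of the wrapper idea described above, assuming a made-up backend. ChannelId and GuildChannel here are simplified stand-ins, not serenity's real model types, and the method names are illustrative only:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for serenity's model types.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ChannelId(u64);
#[derive(Clone, Debug)]
struct GuildChannel { name: String }

// The proposed trait: storage details live behind it.
trait CacheBackend {
    fn channel_insert(&self, id: ChannelId, ch: GuildChannel);
    fn channel_get(&self, id: ChannelId) -> Option<GuildChannel>;
}

// One possible backend: a Mutex-guarded HashMap.
struct MapBackend(Mutex<HashMap<ChannelId, GuildChannel>>);

impl CacheBackend for MapBackend {
    fn channel_insert(&self, id: ChannelId, ch: GuildChannel) {
        self.0.lock().unwrap().insert(id, ch);
    }
    fn channel_get(&self, id: ChannelId) -> Option<GuildChannel> {
        self.0.lock().unwrap().get(&id).cloned()
    }
}

// The public Cache wraps a boxed backend and exposes wrapper
// methods that simply delegate to it.
pub struct Cache { backend: Box<dyn CacheBackend> }

impl Cache {
    fn channel_insert(&self, id: ChannelId, ch: GuildChannel) {
        self.backend.channel_insert(id, ch);
    }
    fn channel_get(&self, id: ChannelId) -> Option<GuildChannel> {
        self.backend.channel_get(id)
    }
}

fn main() {
    let cache = Cache { backend: Box::new(MapBackend(Mutex::new(HashMap::new()))) };
    cache.channel_insert(ChannelId(1), GuildChannel { name: "general".into() });
    assert_eq!(cache.channel_get(ChannelId(1)).unwrap().name, "general");
}
```

Swapping the backend then only means constructing Cache with a different Box<dyn CacheBackend>; callers never see the change.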
I think it's a good idea to have an abstract cache layer, so I'm for it. That said, I need to get up to date with the current cache API, cause it seems like a lot has changed.
Regarding what's cached: I have only a very rough idea of whether efficient cache invalidation is possible in a distributed scenario. So, if we want to cache everything, that means the data sources for the cache are both the HTTP responses and the data retrieved through the websocket connections, right?
Yup @MOZGIII, those are the data sources. I'm not home yet, but I'm going to draw up a proposition for the cache trait sometime today. This abstract layer means we can plug in lots of backends, everything from a set of compact DHashMaps
to something like memcached for distributed caching.
The current cache API is just a RwLock'd struct containing a set of HashMaps, which makes it very inflexible, somewhat slow, and limited in concurrency.
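For reference, the pattern being described, one RwLock around each plain HashMap, looks roughly like this simplified sketch (not serenity's actual code; the field types are reduced for illustration). Note that every write takes an exclusive lock over the entire map:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Simplified model of the current design: one coarse lock per map.
struct CurrentCache {
    channels: RwLock<HashMap<u64, String>>, // id -> channel name, simplified
}

impl CurrentCache {
    fn insert(&self, id: u64, name: &str) {
        // A single-entry write blocks ALL readers of the whole map.
        self.channels.write().unwrap().insert(id, name.to_string());
    }

    fn get(&self, id: u64) -> Option<String> {
        self.channels.read().unwrap().get(&id).cloned()
    }
}

fn main() {
    let cache = CurrentCache { channels: RwLock::new(HashMap::new()) };
    cache.insert(1, "general");
    assert_eq!(cache.get(1).as_deref(), Some("general"));
}
```

A sharded or lock-free map would avoid the whole-map write lock, which is the concurrency limit mentioned above.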
Alright. I have a rough design proposal for how to glue the abstract layer together with everything else. It looks somewhat like this. This is not a complete example, but it should get the gist of the design across.
pub struct ChannelVirtualMap<'a> {
    cache: &'a Cache,
}

impl<'a> ChannelVirtualMap<'a> {
    pub fn insert(&self, id: ChannelId, val: Arc<RwLock<GuildChannel>>) {
        self.cache.channels_insert(id, val);
    }

    // add other methods
}
pub struct Cache {
    backend: Box<dyn CacheBackend>,
}

impl Deref for Cache {
    type Target = Box<dyn CacheBackend>;

    fn deref(&self) -> &Box<dyn CacheBackend> {
        &self.backend
    }
}
pub trait CacheBackend {
    fn channels_list(&self) -> Box<dyn Iterator<Item = (ChannelId, Arc<RwLock<GuildChannel>>)> + '_>;
    fn channels_get(&self, id: ChannelId) -> Option<&Arc<RwLock<GuildChannel>>>;
    fn channels_insert(&self, id: ChannelId, val: Arc<RwLock<GuildChannel>>);
    fn channels_get_mut(&self, id: ChannelId) -> Option<&mut Arc<RwLock<GuildChannel>>>;
    fn channels_map(&self) -> ChannelVirtualMap;

    // Add more methods as necessary, and repeat for the other things we want
    // to cache. Adjust the virtual map type accordingly (hash sets etc.), or
    // omit it entirely depending on the item. Add other virtual types for
    // smooth interaction between the cache and the rest of the code; the
    // virtual types should make adjusting the rest of the library
    // significantly easier.
}
What's the point of having a ChannelVirtualMap
?
Since the current API uses a HashMap, having a virtual map in its place would probably make switching easier.
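The facade idea could look roughly like this. This is a hypothetical sketch with the types simplified (u64/String instead of serenity's model types): a thin type that presents a HashMap-like surface to callers but forwards every call to the Cache, so code written against cache.channels can keep its shape after the swap:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Simplified cache; the real one would delegate to a backend.
struct Cache {
    channels: RwLock<HashMap<u64, String>>,
}

impl Cache {
    fn channels_insert(&self, id: u64, name: String) {
        self.channels.write().unwrap().insert(id, name);
    }
    fn channels_get(&self, id: u64) -> Option<String> {
        self.channels.read().unwrap().get(&id).cloned()
    }
    // Hand out the facade instead of the raw map.
    fn channels_map(&self) -> ChannelVirtualMap<'_> {
        ChannelVirtualMap { cache: self }
    }
}

// The "virtual map": HashMap-like methods, but no storage of its own.
struct ChannelVirtualMap<'a> {
    cache: &'a Cache,
}

impl<'a> ChannelVirtualMap<'a> {
    fn insert(&self, id: u64, name: String) {
        self.cache.channels_insert(id, name);
    }
    fn get(&self, id: u64) -> Option<String> {
        self.cache.channels_get(id)
    }
}

fn main() {
    let cache = Cache { channels: RwLock::new(HashMap::new()) };
    let map = cache.channels_map();
    map.insert(1, "general".to_string());
    assert_eq!(map.get(1).as_deref(), Some("general"));
}
```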
Also, the methods should probably return a Deref<[mytype]>
or an AsRef<[mytype]>
instead.
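One possible reading of that suggestion, sketched with a lock guard as the Deref implementor (the field and type here are made up for illustration): returning impl Deref<Target = T> instead of &T lets a backend hand out whatever handle it likes, a read guard, an Arc, a plain reference, without changing the trait's surface:

```rust
use std::ops::Deref;
use std::sync::RwLock;

struct Cache {
    name: RwLock<String>, // hypothetical cached field
}

impl Cache {
    // RwLockReadGuard<'_, String> implements Deref<Target = String>,
    // so callers can use the return value like a &String while the
    // backend stays free to swap in a different handle type later.
    fn name(&self) -> impl Deref<Target = String> + '_ {
        self.name.read().unwrap()
    }
}

fn main() {
    let cache = Cache { name: RwLock::new("serenity".to_string()) };
    let n = cache.name();
    assert_eq!(&*n, "serenity");
}
```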
Erk and I are going to revisit this once 0.6 ships and he has free time.
I don't think we want such a redesign. Here's what I can think of that users might want from custom caches:
- custom caching policies - can be implemented using callbacks or otherwise runtime configuring the built-in Cache
- only caching resources they need to - implemented in #2210 and can be expanded upon
- only caching model type fields they need to (cc @GnomedDev) - won't work with the Cache trait approach anyways, because the trait methods return whole model type instances, so you can't selectively cache only certain fields
- more efficient data structures - if someone can optimize the data structures, they should contribute to upstream serenity
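The callback idea in the first bullet could be sketched like this. All names here are hypothetical, not serenity's actual API: the built-in cache consults a user-supplied predicate before storing an item, which gives users a custom caching policy without a pluggable backend:

```rust
// Hypothetical message type, reduced to what the policy inspects.
struct Message {
    author_is_bot: bool,
    content: String,
}

// User-configurable policy; the field name is illustrative only.
struct CacheSettings {
    cache_message: Box<dyn Fn(&Message) -> bool + Send + Sync>,
}

struct Cache {
    settings: CacheSettings,
    messages: Vec<Message>,
}

impl Cache {
    // The built-in cache asks the user's predicate before storing.
    fn store(&mut self, msg: Message) {
        if (self.settings.cache_message)(&msg) {
            self.messages.push(msg);
        }
    }
}

fn main() {
    // Policy: never cache messages sent by bots.
    let settings = CacheSettings { cache_message: Box::new(|m: &Message| !m.author_is_bot) };
    let mut cache = Cache { settings, messages: Vec::new() };
    cache.store(Message { author_is_bot: true, content: "beep".into() });
    cache.store(Message { author_is_bot: false, content: "hi".into() });
    assert_eq!(cache.messages.len(), 1);
}
```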
Anyone else want to chime in? This issue has had no activity for 4 years and I feel like none of the serenity devs are interested in redesigning the cache (correct me if I'm wrong?), so maybe we can close this issue
I'm no longer stripping models in a fork, because it's a complete pain to keep updated; but unless there was a feature flag per field, I can't imagine caching only the model fields that are needed would be possible.