deadpool icon indicating copy to clipboard operation
deadpool copied to clipboard

Policies for removing prepared statements from the cache

Open bikeshedder opened this issue 1 year ago • 1 comments

It would be nice if there was a way to limit the amount of queries inside the statement cache so that it becomes a feasible option for use with query builders.

There are various policies for evicting items from a cache:

  • https://en.wikipedia.org/wiki/Cache_replacement_policies

Two policies seam like a good fit for the statement cache:

  • Least recently used (LRU)
  • Least frequently used (LFU)

I was also thinking about adding a second parameter to prepare_cached making it a bit more obvious what's actually happening and giving the user a way to fine tune the cache eviction policies:

/// The cache priority is used whenever the statement cache is cleaned,
/// either by returning the connection to the pool or by manually calling
/// `StatementCache.clean()`.
#[derive(Debug, Default)]
pub struct CachePriority(usize);

impl CachePriority {
    const HIGH: CachePriority = CachePriority(usize::max_value());
    const LOW: CachePriority = CachePriority(usize::min_value());
}

impl Client {
    ...
    fn prepare_cached(&self, query: &str, prio: CachePriority) { ... }
    ...
}

As cache eviction is not necessarily cheap I think it makes sense to only run it when returning the connection to the pool and/or if manually triggered by the user. e.g. StatementCache.clean()


Do we even need this? Or does prepare_cached just need a huge warning sign explaining why you should not use it for any kind of dynamic or one-time queries?

bikeshedder avatar Nov 09 '23 10:11 bikeshedder

@bikeshedder I think the flexibility to define a Cache Replacement Policy for a given Pool makes a lot of sense, especially if we can define things like the max number of queries in a cache and the max size of a cache, such as can be configured using the Postgres JDBC , i.e. preparedStatementCacheQueries and preparedStatementCacheSizeMiB. Then, we can reliably predict on how much memory footprint our prepared statement cache will take up in our database and how to expect the cache to be pruned throughout its lifecycle.

That being said, I also think it would be a good idea to put a huge warning sign around prepare_cached. Throughout our troubleshooting of our memory issue, I struggled to find good documentation around best practices of prepare vs. prepare_cached and the fundamental differences between them and what they're doing under-the-hood. I think more information (and as you mention, a large warning\disclaimer) would go a long way for users of this crate.

burkematthew avatar Nov 09 '23 14:11 burkematthew