scylla-rust-driver icon indicating copy to clipboard operation
scylla-rust-driver copied to clipboard

Optimize CachinSession

Open wyfo opened this issue 2 years ago • 2 comments

CachingSession is currently suboptimal, especially when using it &'static str (and I suppose it's the main use case), mostly for two reasons:

  • Into<Query> implies a string cloning in case of &str; however, making the query is not necessary to perform cache lookup, only the string reference is necessary
  • cache stores raw PreparedStatements, which have to be cloned when retrieved (and PreparedStatement is quite heavy to clone).

I've implemented an optimization to solve this issue, in which I've made the following modifications:

  • introduce trait QueryString {fn query_string(&self) -> &str;} and use query: impl Into<Query> + QueryString in CachingSession::Execute (and others) signature; cache lookups are performed using query.query_string(), and query.into() is only used when the statement must be prepared
  • store Arc<PreparedStatement> into the cache

I've done some benchmarks (using my old i7 4650U MacBook Air), in which I've measured only one prepared statement retrieval (no serialization and database access) from &'static str. Here are the results :

query string size current optimized
20 390 ns 205 ns
100 450 ns 250 ns

I let you judge if the optimization could be worth it. The implementation is just a draft and maybe adapted (I can open a PR).

wyfo avatar Jul 18 '22 13:07 wyfo

Kind of duplicate of #475, but implementations are quite different

wyfo avatar Jul 18 '22 13:07 wyfo

Hey, I was about to open an issue around the fact that &'static str is the default case in most instances, or at least it should be (since if you're building queries with format strings you're potentially going to have injections), and therefor Query should accept a &str. This is particularly nice if you're calling the same query many times but with different parameters.

I also agree that the cache should hold Arc<PreparedStatement> to make it cheaper to clone.

I have a couple of notes on your code: https://github.com/wyfo/scylla-rust-driver/blob/optimize_caching_session/scylla/src/transport/session.rs#L595

prepare is a bit unfortunate in that it takes impl Into<Query> but never actually needs ownership of Query. That's because Connection::send_request borrows the request. So at no point is an owned buffer actually necessary at all.

The main benefit of taking impl Into<Query> is that it lets you provide static strings, but it means that every called to impl Into<Query> is forcing an allocation. That's still happening with the approach you've taken.

This is why I went down the route of making impl Into<Query> cheap. Unfortunately, I've lost that code. Whoops.

I'd like to hear from the Scylla devs on this, if there's any interest or not. I think there are a few things to do here:

  1. Moving PreparedStatement into Arc should likely be its own PR, it should be a very straightforward win. There is a clone of PreparedStatement for every add_prepared_statement call, whether it was previously prepared or not. PreparedStatement is a very large structure that itself has many pointers, including Arc<dyn _> pointers, so this should just strictly be a win.

  2. Determining what amount of API breakage, if any, is acceptable. In the approach I took I had a Cow<'a, String> in order to minimize API breakage. I think optimizing for the &'static or &'a cases makes the most sense for two reasons.

  • Most queries should be static strings, or otherwise references
  • At no point does Scylla's driver require owning the bytes. This makes sense when you consider that the query is going to be compressed into a separate buffer, meaning it never needs to manage the lifetime internally

The changes to to avoid those clones and copies would be:

  • Changing Query to Query<'a>
  • Modifying Session::prepare to not require ownership of the Query. This will help in the case where a String is used as the buffer.

I took a quick pass at redoing the work. https://github.com/scylladb/scylla-rust-driver/compare/main...colin-grapl:scylla-rust-driver:remove-allocations

This is a breaking API change. In particular, if you were calling .into() on queries before passing them into execute, you have to not do that. I have also added lifetimes to a number of structs.

colin-grapl avatar Oct 13 '22 11:10 colin-grapl