drift icon indicating copy to clipboard operation
drift copied to clipboard

Improve performance of `InsertStatement.createContext` in batches

Open btrautmann opened this issue 1 year ago • 3 comments

Currently createContext is one of the more recognizable bottlenecks when doing large batches/insertions with drift. In talking with @simolus3 it sounds like there are some improvements that could be made here to speed this up.

Describe the solution you'd like

There's no exact target, but here's a CPU Flame chart to highlight the impact of createContext's current performance: Screenshot 2024-07-23 at 15 54 18

btrautmann avatar Jul 23 '24 19:07 btrautmann

To add context, the optimization I had in mind was to hash companions being inserted in batches by which columns they have present values. At the moment, when we insert a large amount of rows that have a similar "shape" (in the sense that we generate the same SQL for them, only the bound variables are different), we build the SQL again for every single row. For a five-column table CREATE TABLE tbl (a,b,c,d,e) for example, what we could do is represent the present values on a companion in a bitfield. A companion with all values set would be represented with 31, which can serve as a key for a cache mapping companions to the SQL already built for them. Now, inserting say 10k of these companions only requires building their bitfield and extracting Variable instances 9999 out of the 10000 times. With the current implementation, every single operation requires building SQL, which is more expensive (lots of string buffer operations).

simolus3 avatar Jul 23 '24 22:07 simolus3

Moving some parts of the operation to Rust could help us with this.
Native Assets is a new experimental dart feature which would allow us to do this without depending on flutter.

Isar depends on flutter and uses rust, we could move some limited parts if @simolus3 is down to clown

dickermoshe avatar Aug 12 '24 00:08 dickermoshe

I don't see what performance benefits that would unlock to be honest, since we'd still have a Dart API responsible for expressing queries. The parts of Isar that are written in Rust correspond to sqlite3, which is already a native library. The way it seems, the key is doing less duplicate work when building queries. I don't think rewriting parts alone is going to help substantially, Dart is not intrinsically slow.

simolus3 avatar Aug 12 '24 15:08 simolus3