Replace DirtyNIF execution model with a different mechanism

warmwaffles opened this issue 3 years ago • 56 comments

The DirtyNIF execution model is okay for most cases, but we have an issue where long-running writes need to time out / be interrupted mid-execution and aborted. The only way to do that is to utilize https://sqlite.org/c3ref/progress_handler.html
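
For illustration only, here is a sketch of how such a timeout could look from the Elixir side, assuming a hypothetical set_deadline/clear_deadline NIF pair that registers sqlite3_progress_handler in C and returns non-zero from it once the deadline passes (none of these functions exist in exqlite today; Exqlite.Sqlite3.execute/2's exact signature may also differ):

def execute_with_timeout(db, sql, timeout_ms) do
  # hypothetical NIF: check the deadline roughly every 1000 VM opcodes;
  # once it passes, the progress handler returns non-zero and the running
  # statement aborts with SQLITE_INTERRUPT
  :ok = Exqlite.Sqlite3.set_deadline(db, 1000, timeout_ms)
  result = Exqlite.Sqlite3.execute(db, sql)
  :ok = Exqlite.Sqlite3.clear_deadline(db)
  result
end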

warmwaffles avatar Feb 02 '22 16:02 warmwaffles

I wonder if you may be able to stick with the process model as long as you assume each instance of the database is backed by one process. This way you make Elixir processes your thread pool, instead of having to deal with it in C/C++/Rust.
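
As a rough sketch of that idea (assuming the low-level Exqlite.Sqlite3 API; exact signatures may differ): one GenServer owns one connection, so its mailbox serialises all access:

defmodule DB do
  use GenServer

  def start_link(path), do: GenServer.start_link(__MODULE__, path)

  # every query is a call into the owning process, so access to the
  # underlying connection is serialised by the mailbox
  def query(pid, sql), do: GenServer.call(pid, {:query, sql})

  @impl true
  def init(path), do: Exqlite.Sqlite3.open(path)

  @impl true
  def handle_call({:query, sql}, _from, db) do
    {:reply, Exqlite.Sqlite3.execute(db, sql), db}
  end
end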

josevalim avatar Jan 19 '23 17:01 josevalim

@josevalim I think that's a good idea: only one process in the supervision tree sits there waiting for the next DBConnection call. This could make it easier to cancel existing running queries.

Let me noodle on this a little more.

warmwaffles avatar Jan 19 '23 17:01 warmwaffles

Right, I think in this sense DBConnection may be getting more in the way than helping. Because you need serial access around SQLite3 and having a pool around serial access is not giving you anything. And when you don't need serial access (i.e. read operations), you don't need a pool at all.

DBConnection is necessary for Ecto but, even then, it should not be a requirement. There are only three features in Ecto.Adapters.SQL that require DBConnection (sandbox, stream, and disconnect_all) but they are all addressable.

Then there is the part of DBConnection as a convenience. I can think of two features it gives:

  1. a transaction manager - if the process doing a transaction crashes, we need to abort it, but perhaps the SQLite3 API itself already has conveniences for handling dangling transactions (see the sketch after this list)
  2. the ownership pool
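
For the first item, a minimal sketch of how the connection-owning process could abort dangling transactions by monitoring the caller (all names here are illustrative, and Exqlite.Sqlite3.execute/2's exact signature may differ):

# inside the process that owns the connection: monitor whoever begins
# a transaction and roll it back if that process dies before committing
def handle_call(:begin, {owner, _tag}, %{db: db} = state) do
  :ok = Exqlite.Sqlite3.execute(db, "begin")
  {:reply, :ok, %{state | tx: Process.monitor(owner)}}
end

def handle_info({:DOWN, ref, :process, _pid, _reason}, %{tx: ref, db: db} = state) do
  # the caller crashed mid-transaction: abort so the connection is reusable
  :ok = Exqlite.Sqlite3.execute(db, "rollback")
  {:noreply, %{state | tx: nil}}
end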

Of course there is a lot of work here, so I am not saying we should yank DBConnection, but it may be worth considering moving the DBConnection bits to ecto_sqlite3. In this project, based on the analysis above (which may be incomplete), you would only be missing the transaction manager (and it may be provided by sqlite3).

josevalim avatar Jan 20 '23 07:01 josevalim

There might be an additional benefit to managing transactions in a SQLite-specific way. Currently there's no information returned to changesets about which constraint was violated, but there seem to be some pragmas to figure that out, as SQLite doesn't close transactions on errors.

What I'm wondering about @josevalim's suggestion is how using processes to serialize or not serialize access would work for transactions which only become write transactions mid-execution. Wouldn't the suggested approach require knowing in advance whether a transaction does writes?

LostKobrakai avatar Jan 24 '23 09:01 LostKobrakai

~~Wouldn't having one queuing process per db instance negate WAL benefits? There still needs to be a pool of connections for concurrent reads. It's not possible to open a single connection and start multiple transactions on it. And if we try to work around that with snapshots, we'd start losing isolation.~~

ruslandoga avatar Feb 09 '23 10:02 ruslandoga

@ruslandoga the reads can go directly to the NIF. Only the writes and operations inside a transaction would go through the process.

josevalim avatar Feb 09 '23 11:02 josevalim

~~But how exactly would reads go directly to the NIF? Using what database connection? It can't be the one from the "writer" process since it might be in the middle of a transaction, and then we need synchronisation of reads and writes.~~

Ah, I think I understand now. Since the reads are "scoped" to a prepared statement, it might be possible to reuse a single database connection for multiple statements at once. I'll try it out.

UPDATE: it works (https://github.com/ruslandoga/exqlite/pull/3), and reads seemingly became faster:

iex(1)> {:ok, conn} = Exqlite.start_link(database: "demo.db", debug: [:trace])
{:ok, #PID<0.245.0>}
iex(2)> Exqlite.write(conn, "create table users(name text not null) strict")
# *DBG* <0.245.0> got call checkout from <0.252.0>
# *DBG* <0.245.0> sent ok to <0.252.0>, new state {[],<0.252.0>}
# *DBG* <0.245.0> got cast checkin
{:ok, []}
# *DBG* <0.245.0> new state {[],nil}

iex(3)> Exqlite.write(conn, "insert into users(name) values(?)", ["john"])
{:ok, []}
iex(4)> Exqlite.write(conn, "insert into users(name) values(?)", ["david"])
{:ok, []}
iex(5)> Exqlite.write(conn, "insert into users(name) values(?)", ["peter"])
{:ok, []}

iex(6)> {:ok, task_sup} = Task.Supervisor.start_link
{:ok, #PID<0.254.0>}

iex(7)> Task.Supervisor.async_stream_nolink(task_sup, 1..10000, fn _ -> Exqlite.query(conn, "select * from users") end, max_concurrency: 1000) |> Enum.into([])
[
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]},
  ok: {:ok, [["john"], ["david"], ["peter"]]}
# ...

I wonder how the statement cache would work with this approach. It'd need to be serialised somehow as well.

ruslandoga avatar Feb 09 '23 11:02 ruslandoga

I think the tricky thing would be supporting concurrent reads from within transactions, which didn't yet do writes: https://www.sqlite.org/lang_transaction.html#deferred_immediate_and_exclusive_transactions

Not sure how exqlite could know which statement within a transaction requires a SHARED lock (reads) vs one of the locks required for writing. Maybe it could naively check for SELECT … in the SQL.
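
For illustration, such a naive check could be as small as this (a sketch; it errs on the safe side by treating CTE reads as writes):

def read_only?(sql) do
  # best-effort: a statement is treated as a read only if it starts with
  # SELECT; CTE reads ("WITH ... SELECT") would be misclassified as writes
  sql |> String.trim_leading() |> String.downcase() |> String.starts_with?("select")
end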

LostKobrakai avatar Feb 09 '23 12:02 LostKobrakai

@LostKobrakai I think all transactions would need to go through the queue. Even if a read transaction never becomes a write one, it still can't be executed without syncing with the writers first. Otherwise multiple transactions could be attempted on the same connection:

BEGIN; -- from writer in process 1
INSERT INTO ...
INSERT INTO ...

                 BEGIN; -- a reader starts a transaction in process 2
                 -- SQLite complains about re-entry: "cannot start a transaction within a transaction"

COMMIT; -- writer finishes the transaction from process 1

This is actually what I was worried about in the comments above. But since reads don't have to be wrapped in transactions, the single-connection approach works.

It might be possible to do snapshots though, for read-only transactions without BEGIN / COMMIT, at the NIF level.

ruslandoga avatar Feb 09 '23 12:02 ruslandoga

I just tried this with two TablePlus instances on the same database, and it allowed me to start a read transaction just fine while another write transaction was already started. Trying to start a second write transaction, I got "database is locked". Though I guess that's two "connections" to SQLite, isn't it?

LostKobrakai avatar Feb 09 '23 13:02 LostKobrakai

two TablePlus instances

That means two connections. I think the approach discussed in this issue is having only a single connection open to the database.


$ sqlite3 demo.db

SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> begin;
sqlite> begin;
Runtime error: cannot start a transaction within a transaction

ruslandoga avatar Feb 09 '23 13:02 ruslandoga

Given DBConnection is an abstraction over connections, I'm not so sure. I'd for sure appreciate it if it actually could do concurrent reads (in a transaction), given how much code in a business application needs to run in a transaction, and such code usually starts by reading a bunch of things before it writes.

LostKobrakai avatar Feb 09 '23 13:02 LostKobrakai

I think there is a danger of those transactions failing with SQLITE_BUSY, so serialising all transactions seems like the safer way.

-- conn 1                                            -- conn 2
sqlite> begin;
                                                     sqlite> begin;
                                                     sqlite> select * from users;
sqlite> delete from users where name = 'patrick';
                                                     sqlite> insert into users(name) values('jacob');
                                                     Runtime error: database is locked (5)

If it's about performance, we can try coming up with some benchmarks to see at what write/read ratio a single-connection approach wins or loses. My hot take is that (read-)write transactions are rare and reads are common, so having un-pooled reads would end up being faster on average. Better memory usage too, probably.
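
A sketch of such a benchmark with Benchee, where Pooled and Single are hypothetical stand-ins for the two designs (a DBConnection-style pool vs a single connection with un-pooled reads):

# hypothetical setup for the two designs being compared
{:ok, pool} = Pooled.start_link(database: "demo.db", pool_size: 10)
{:ok, conn} = Single.start_link(database: "demo.db")

# measure read throughput under concurrency for both
Benchee.run(
  %{
    "pooled reads" => fn -> Pooled.query(pool, "select * from users") end,
    "un-pooled reads" => fn -> Single.read(conn, "select * from users") end
  },
  parallel: System.schedulers_online()
)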

ruslandoga avatar Feb 09 '23 13:02 ruslandoga

That's why I was wondering if exqlite could catch writes before they queue for "BUSY WAIT", with the queuing happening in Elixir land. In the end this wouldn't be much different from e.g. a timeout due to a too-long-running query with e.g. Postgres/Ecto. Another option would be possible if exqlite could time out a transaction from the outside, with busy_timeout set to a value larger than Elixir's internal timeout.

LostKobrakai avatar Feb 09 '23 13:02 LostKobrakai

I don't understand. How would busy_timeout help in the above example? No matter how long we wait, if we started a deferred transaction before some other connection made a write to a table we read from, we won't be able to commit it even after the other connection's write has finished.

-- conn 1                                            -- conn 2
sqlite> begin;
                                                     sqlite> begin;
                                                     sqlite> select * from users;
sqlite> delete from users where name = 'patrick';
sqlite> commit;
                                                     sqlite> insert into users(name) values('jacob');
                                                     Runtime error: database is locked (5)

We can, however, remove all queueing in Elixir and rely on what SQLite returns (an error code or ok), but that seems more difficult to develop for. One would need to know and understand SQLite specifics. ~~And we probably wouldn't be able to rollback exiting writers~~ or provide :checkout_timeout per query, since pragma busy_timeout is global (I think).
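
For illustration, "relying on what SQLite returns" would push retry loops like this into application code (a sketch; the {:error, :busy} shape and Exqlite.Sqlite3.execute/2 signature are illustrative, not exqlite's actual API):

def write_with_retry(db, sql, attempts \\ 10) do
  case Exqlite.Sqlite3.execute(db, sql) do
    # hypothetical error shape for SQLITE_BUSY; real bindings may differ
    {:error, :busy} when attempts > 0 ->
      Process.sleep(10)
      write_with_retry(db, sql, attempts - 1)

    other ->
      other
  end
end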


I think busy_timeout would make concurrent read+write begin immediate; transactions work by queuing them, but that'd disable concurrent reads as well.

ruslandoga avatar Feb 09 '23 13:02 ruslandoga

That's unexpected; I would've expected this to work based on the documentation on the topic. If this is the case, I'd wonder why busy_timeout exists in the first place, if waiting makes no sense.

LostKobrakai avatar Feb 09 '23 13:02 LostKobrakai

What if we try updating a record we read but that has since been deleted? I don't think it can work at all.

-- conn 1                                            -- conn 2
sqlite> begin;
                                                     sqlite> begin;
                                                     sqlite> select rowid from users where name = 'john';
sqlite> delete from users where name = 'john';
sqlite> commit;
                                                     sqlite> update users set name = 'john jr.' where rowid = 1;
                                                     Runtime error: database is locked (5)

ruslandoga avatar Feb 09 '23 13:02 ruslandoga

If there are conflicting changes, I think most databases will roll back one transaction. Though deleting one row and adding another shouldn't really be conflicting, but that might just be SQLite.

LostKobrakai avatar Feb 09 '23 14:02 LostKobrakai

I'm back to thinking that a single connection with un-pooled reads wouldn't work. If we issue reads directly to the NIF while a writer is in a transaction, we lose isolation guarantees, as we can read data from an unfinished transaction.

iex> spawn(fn ->
  Exqlite.transaction(db, fn conn ->
    Exqlite.query(conn, "insert into users(name) values(?)", ["new"])
    :timer.sleep(:timer.seconds(1000))
  end)
end)
iex> Exqlite.query(db, "select * from users")
{:ok,
 [
   ["david"],
   ["peter"],
   ["new"]
 ]}

I'm thinking now about using two connections, one for writers and one for readers. The readers' connection would accept un-pooled reads. But even when un-pooled, reads would still be limited by the number of dirty schedulers.


but that might just be sqlite.

Yeah, I think in the stock version SQLite "marks" the whole database as dirty if it has been modified (outside of the current transaction) since the transaction started (or rather, it compares the transaction IDs at the start and at the first write). There is a begin-concurrent branch that allows marking individual pages as dirty (but that still doesn't help when the databases are small and fit in a few pages), and now I think a high-concurrency branch is in the works with row-level locking like in the other DBs.

ruslandoga avatar Feb 09 '23 16:02 ruslandoga

I'm thinking now about using two connections, one for writers and one for readers. The readers connection would accept un-pooled reads.

That didn't end up working either. Reads start an implicit transaction, and that's a problem since overlapping concurrent reads can keep that transaction open indefinitely, so the connection would never see new data.

conn1 (readonly)                                  conn2 (writeonly)
prepare(stmt1)
step(stmt1) <- tx opened
                                                  insert row
step step step(stmt1) <- new row not visible, ok
prepare(stmt2)
step step step(stmt2) <- new row not visible
finish(stmt1) <- tx not closed since stmt2 is still stepping
prepare(stmt3), step(stmt3) <- new row still not visible, not ok since the current tx was opened for stmt1

ruslandoga avatar Feb 24 '23 13:02 ruslandoga

Sorry for staying out of this conversation and thanks for exploring @ruslandoga.

It seems like we would need the process to work as a read-write lock. While reads are happening, we allow them to run concurrently. Once a write arrives, we start queueing events and draining the reads. Once they are all drained, we execute the write and then move to the next request in the queue. If the next one is a read, we start doing concurrent work again. All queries can still run on the client. You can find an implementation of serialized reads-writes from a NIF running on the client here: https://github.com/elixir-explorer/adbc/blob/main/lib/adbc_connection.ex
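
A condensed sketch of that read-write lock (bookkeeping only; the actual NIF calls would run on the client between checkout and checkin, and the module name is made up):

defmodule RWLock do
  use GenServer

  # reads check out concurrently; once a write is queued, new requests
  # queue behind it; the write is granted only after all active readers
  # have checked back in ("drained")
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def checkout_read(pid), do: GenServer.call(pid, :read, :infinity)
  def checkout_write(pid), do: GenServer.call(pid, :write, :infinity)
  def checkin(pid), do: GenServer.cast(pid, :checkin)

  @impl true
  def init(:ok), do: {:ok, %{readers: 0, writer?: false, queue: :queue.new()}}

  @impl true
  def handle_call(kind, from, state) do
    cond do
      kind == :read and not state.writer? and :queue.is_empty(state.queue) ->
        {:reply, :ok, %{state | readers: state.readers + 1}}

      kind == :write and not state.writer? and state.readers == 0 and
          :queue.is_empty(state.queue) ->
        {:reply, :ok, %{state | writer?: true}}

      true ->
        {:noreply, %{state | queue: :queue.in({kind, from}, state.queue)}}
    end
  end

  @impl true
  def handle_cast(:checkin, state) do
    state =
      if state.writer?,
        do: %{state | writer?: false},
        else: %{state | readers: state.readers - 1}

    {:noreply, grant(state)}
  end

  # grant whatever is at the head of the queue: a run of reads, or a
  # single write once the readers have drained
  defp grant(state) do
    case :queue.out(state.queue) do
      {:empty, _} ->
        state

      {{:value, {:read, from}}, rest} ->
        if state.writer? do
          state
        else
          GenServer.reply(from, :ok)
          grant(%{state | readers: state.readers + 1, queue: rest})
        end

      {{:value, {:write, from}}, rest} ->
        if state.writer? or state.readers > 0 do
          # wait for the remaining readers (or the writer) to check in
          state
        else
          GenServer.reply(from, :ok)
          %{state | writer?: true, queue: rest}
        end
    end
  end
end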

Having a concurrent-reads version should be an expansion of the API above (but hey, you can even start with a fully serialized version and add the read concurrency later). API-wise, you could keep "query" as a write operation, but you can introduce a "read_query" where the user guarantees they are exclusively reading.

In any case, the main point is that I think this library can be cleaner and easier to maintain if it doesn't try to do NIFs, process management, and DBConnection at the same time. I think NIFs + process management would make this usable across more scenarios and DBConnection is an Ecto concern (which perhaps we can move all the way up to Ecto itself).

josevalim avatar Jul 13 '23 12:07 josevalim

@josevalim @warmwaffles if there are no other takers, I'm interested in exploring this approach :)

ruslandoga avatar Jul 13 '23 13:07 ruslandoga

Definitely go for it from my side - but that's @warmwaffles decision to make. :)

josevalim avatar Jul 13 '23 13:07 josevalim

Yea sure why not. @ruslandoga if you want to give it a shot I'm all for it. I'm really swamped with day job work at the moment, but I have time to do reviews and testing implementations.

warmwaffles avatar Jul 13 '23 16:07 warmwaffles

I've started a PR https://github.com/elixir-sqlite/exqlite/pull/255 for https://github.com/elixir-sqlite/exqlite/issues/192#issuecomment-1634198546 and while working on it I had one concern. With this approach we miss out on WAL allowing multiple readers while there's a writer, but I guess it could be tackled in ecto_sqlite3 by starting two connections and, for example, dedicating one to writes and the other to reads. To avoid https://github.com/elixir-sqlite/exqlite/issues/192#issuecomment-1443701807, these connections would be "cooperative": after a write happens, the read connection would be drained of reads to close the implicit transaction and allow the new data to become visible.

ruslandoga avatar Jul 19 '23 12:07 ruslandoga

Before I can answer, a quick question: can we perform concurrent reads over the same single connection in SQLite3? Or do the reads have to be serialized?

josevalim avatar Jul 19 '23 12:07 josevalim

Yes, we can perform concurrent reads; the "result rows" are "scoped" per statement. So as we step through two different prepared statements on the same SQLite connection from different processes, we don't get interleaved results and all is fine :)

Here's a test case that is supposed to ensure the reads are concurrent: https://github.com/ruslandoga/exqlite/blob/c07b537424a62b3eb756988e0b2a2c00effbe970/test/exqlite/rw_connection_test.exs#L119-L139

The only problem is that these statements open implicit transactions, which means that unless all statements are "done" (released or have reached the end of their rows), no new data is visible over that connection (e.g. if it has been inserted over another connection).
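
A sketch of that per-statement scoping with the low-level API (signatures are from memory and may differ; assumes the john/david/peter rows inserted earlier in the thread):

# two prepared statements on the same connection, stepped interleaved:
# each keeps its own cursor, so the result rows never mix
{:ok, db} = Exqlite.Sqlite3.open("demo.db")
{:ok, stmt1} = Exqlite.Sqlite3.prepare(db, "select name from users")
{:ok, stmt2} = Exqlite.Sqlite3.prepare(db, "select name from users")

{:row, ["john"]} = Exqlite.Sqlite3.step(db, stmt1)
{:row, ["john"]} = Exqlite.Sqlite3.step(db, stmt2)
{:row, ["david"]} = Exqlite.Sqlite3.step(db, stmt1)

# but until both statements are released (or reach :done), the
# connection's implicit read transaction stays open, and writes made
# on other connections stay invisible here
:ok = Exqlite.Sqlite3.release(db, stmt1)
:ok = Exqlite.Sqlite3.release(db, stmt2)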

ruslandoga avatar Jul 19 '23 12:07 ruslandoga

I see. So we need to answer how low-level vs high-level we want to be. SQLite3 doesn't handle any of this, so we could choose not to either. But if we want to provide high-level abstractions, then it makes sense to provide a high-level abstraction that would also be used by EctoSQLite3. So my suggestion is to have the RWLock start both connections (we could make the read transaction lazy in the future if that's ever an issue): query goes through the write one, read_query goes through the read one.

josevalim avatar Jul 19 '23 13:07 josevalim

SQLite, AFAIK, can handle concurrent reads, but the moment a write comes along, things get tricky. It also depends on how the database's journal_mode is set. If it is set to wal, SQLite will do its best to resolve writes that happen concurrently.

I'm not sure how it performs when the database is opened with :memory: when trying to do concurrent writes. I'm fairly certain it'll handle that just fine, since it would all be in-process and able to take the appropriate locks in-process.

warmwaffles avatar Jul 19 '23 13:07 warmwaffles

So my suggestion is to have the RWLock start both connections (we could make the Read transaction be lazy in the future if that's ever an issue). query goes through the write one, read_query goes through the read one.

I might have misunderstood it a bit but here's a PR https://github.com/elixir-sqlite/exqlite/pull/256

Right now it opens two SQLite3 DBs in the same Elixir process: one for the locked query and the other for the concurrent read_query.

My concern with doing that is that it seems a bit sneaky. I'm trying to come up with scenarios where it could bite the user:

  • :memory: databases
  • :mode != :wal
  • https://github.com/elixir-sqlite/exqlite/pull/256#issuecomment-1645026108
  • I feel like there might be something else, like builds or setups where reading from another connection wouldn't work well

Maybe Exqlite could provide multiple connection types:

  • Exqlite.ReadConnection -- concurrent reads, which are drained on GenServer.cast(conn, :drain)
  • Exqlite.Connection -- all commands are serialised
  • Exqlite.RWConnection -- similar to https://github.com/elixir-sqlite/exqlite/pull/256 but more of a proxy/supervisor that starts both of the above and also exports read_query/3 (the two above only have query/3)

This would make Exqlite.Connection a safe default and Exqlite.RWConnection a performant choice for Ecto.Adapters.SQLite3.
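
Usage could then look something like this (all module and function names below are hypothetical, following the list above):

# safe default: every command serialised through one process
{:ok, conn} = Exqlite.Connection.start_link(database: "demo.db")
{:ok, _} = Exqlite.Connection.query(conn, "insert into users(name) values(?)", ["john"])

# performant choice: writes serialised, reads concurrent
{:ok, rw} = Exqlite.RWConnection.start_link(database: "demo.db")
{:ok, _} = Exqlite.RWConnection.query(rw, "insert into users(name) values(?)", ["ann"])
{:ok, _} = Exqlite.RWConnection.read_query(rw, "select * from users", [])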

ruslandoga avatar Jul 21 '23 05:07 ruslandoga