This is so cool!! Can I help?

Open cweagans opened this issue 2 years ago • 1 comment

I don't know how this project hasn't gotten more attention. This is such a neat way to use sqlite! It's exactly what I've been looking for for random stuff I want to get set up on Lambda.

I know Go pretty well and I'm reasonably familiar with DynamoDB, but I'm somewhat new to sqlite internals and such. Do you have a todo list for this project somewhere that I can help with somehow?

In particular, I'm very interested in supporting lots of concurrent readers. One approach I thought about was to use DynamoDB replication to replicate the table into a separate read-only copy of the database and in Donut, somehow designate that second table as read-only. Sort of the same idea as having a read replica on MySQL or Postgres: I'm not sure on the DonutDB specifics, but I think you wouldn't have to do any locking if there was a guarantee that Donut would only be reading from the DB, right?

cweagans avatar May 29 '22 20:05 cweagans

Glad you like it.

If you have a copy of the database that you know won't be changed, you can open the database with the immutable=1 URI parameter. That disables all locking and will give you better read performance. We set that automatically in sqlite3vfshttp, which makes a significant difference. If you snapshot and replicate the DynamoDB table to a read-only copy, I would expect this to give you much better read performance.
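As a rough illustration of the immutable option, here is a minimal sketch that builds a SQLite URI filename with the immutable flag set, using only the Go standard library. The helper name immutableDSN, the database name replica.db, and the VFS name "dynamodb" are assumptions made for the example; use whatever name your VFS is actually registered under.

```go
package main

import (
	"fmt"
	"net/url"
)

// immutableDSN builds a SQLite URI filename that marks the database as
// immutable and read-only. vfsName is optional; pass "" for the default VFS.
// The function name and its parameters are hypothetical, for illustration only.
func immutableDSN(dbName, vfsName string) string {
	q := url.Values{}
	q.Set("immutable", "1") // SQLite URI parameter: file is assumed never to change
	q.Set("mode", "ro")     // open read-only
	if vfsName != "" {
		q.Set("vfs", vfsName) // select a custom VFS by its registered name
	}
	// Opaque keeps the "file:name?query" form that SQLite expects.
	u := url.URL{Scheme: "file", Opaque: dbName, RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Assumed names: "replica.db" and a VFS registered as "dynamodb".
	fmt.Println(immutableDSN("replica.db", "dynamodb"))
	// prints: file:replica.db?immutable=1&mode=ro&vfs=dynamodb
}
```

The resulting string would be passed as the data source name to whichever SQLite driver you use, provided that driver accepts URI filenames. Only set immutable when the underlying table really cannot change, since SQLite will not detect modifications made behind its back.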

Another option would be to implement a more advanced locking strategy in DynamoDB itself. Currently we use a simple global lock that serializes all access to the database (akin to the dot-file lock strategy built into SQLite). I think it would be feasible to implement a more advanced lock strategy that allows for multi-reader, single-writer access. You'd need a partition that would store the set of active locks and then ensure changes to that partition happen atomically. I'd be curious to see how much additional overhead this adds to queries.
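To make the multi-reader, single-writer idea concrete, here is a minimal sketch of the lock bookkeeping. It is not DonutDB's implementation: an in-memory struct stands in for the DynamoDB item that holds the active locks, and putIfVersion imitates a conditional write that only succeeds if the stored version is unchanged. All type and function names are hypothetical. Lease expiry for crashed clients and SQLite's intermediate lock states (RESERVED, PENDING) are left out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// lockState models the single item that would hold the set of active locks.
type lockState struct {
	Version int64 // bumped on every successful write
	Readers int   // number of shared locks currently held
	Writer  bool  // true while the exclusive lock is held
}

// lockStore is an in-memory stand-in for a DynamoDB item (an assumption for
// this sketch). The mutex only makes each simulated request atomic.
type lockStore struct {
	mu    sync.Mutex
	state lockState
}

var (
	errConflict = errors.New("conditional write failed")
	errBusy     = errors.New("lock busy")
)

func (s *lockStore) get() lockState {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.state
}

// putIfVersion imitates a conditional put: it writes next only if the stored
// version still equals expected.
func (s *lockStore) putIfVersion(expected int64, next lockState) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.state.Version != expected {
		return errConflict
	}
	next.Version = expected + 1
	s.state = next
	return nil
}

// update retries read-modify-write until the conditional write lands, or
// returns early if mutate reports that the lock is unavailable.
func (s *lockStore) update(mutate func(*lockState) error) error {
	for {
		cur := s.get()
		next := cur
		if err := mutate(&next); err != nil {
			return err
		}
		if err := s.putIfVersion(cur.Version, next); err == nil {
			return nil
		}
		// Another client changed the item first; re-read and try again.
	}
}

func (s *lockStore) acquireShared() error {
	return s.update(func(st *lockState) error {
		if st.Writer {
			return errBusy
		}
		st.Readers++
		return nil
	})
}

func (s *lockStore) releaseShared() error {
	return s.update(func(st *lockState) error {
		if st.Readers > 0 {
			st.Readers--
		}
		return nil
	})
}

func (s *lockStore) acquireExclusive() error {
	return s.update(func(st *lockState) error {
		if st.Writer || st.Readers > 0 {
			return errBusy
		}
		st.Writer = true
		return nil
	})
}

func main() {
	s := &lockStore{}
	fmt.Println(s.acquireShared())    // <nil>
	fmt.Println(s.acquireShared())    // <nil>: readers can share
	fmt.Println(s.acquireExclusive()) // lock busy: readers still active
}
```

Against the real service, each get and putIfVersion would be a network round trip, so every lock transition costs at least two requests plus retries under contention. That is the overhead that would need measuring.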

Otherwise, I think there are a number of areas where we could improve performance. I have a few branches where I was working on this some last year. Obviously we need to be extremely careful to not introduce data corruption bugs with any optimizations we add.

I found that the local Java DynamoDB server behaves fairly differently from the real DynamoDB. You'll definitely want to test against the real DynamoDB to see how your changes work when you have non-negligible network latencies. Just something to be careful of.

psanford avatar May 30 '22 18:05 psanford