deno-sqlite
IO is extremely slow
Running this bench
import { DB } from 'https://deno.land/x/[email protected]/mod.ts';

function insert(db: DB, s: string): void {
  db.query('INSERT INTO tbl ( value ) VALUES ( :value )', { value: s });
}

function bench(filename: string, n: number = 10000): void {
  const db = new DB(filename);
  db.query('CREATE TABLE tbl ( id INTEGER PRIMARY KEY AUTOINCREMENT, value TEXT )');
  console.log(`start ${filename} w/ ${n} insertions`);
  const a = Date.now();
  for (let i = 0; i < n; i++) {
    insert(db, `${i}`);
  }
  const b = Date.now();
  const dt = b - a;
  console.log(`total ${dt} ms, ${dt / n} ms per query`);
  db.close();
}

bench(':memory:');
bench('test.db', 10);
Results in
start :memory: w/ 10000 insertions
total 625 ms, 0.0625 ms per query
start test.db w/ 10 insertions
total 1322 ms, 132.2 ms per query
When profiling, it appears that the majority of the time is spent within Deno internals, namely opSync, so I don't know if this is fixable on the library end, though.
This might be related to #165. Are you running your tests on a Mac?
Nope, this is Fedora Linux.
Hm yeah. Not sure what we can do outside of making Deno itself faster. We could try to buffer writes internally and only go to deno when sqlite requests a sync? (We'd definitely have to benchmark this, though.)
We could try to buffer writes internally and only go to deno when sqlite requests a sync?
This would be great. I have a project where batching inserts in transactions is impractical, because the value of each insertion depends on querying the values that came before it. I'd love not to have to write my own buffer abstraction.
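As an aside, inserts whose values depend on earlier rows can often still be batched in one explicit transaction, since reads inside a transaction see that transaction's own uncommitted writes. A minimal sketch using Python's sqlite3 (the table and the "double the previous value" rule are made up for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE tbl ( id INTEGER PRIMARY KEY AUTOINCREMENT, value INTEGER )")
db.execute("INSERT INTO tbl ( value ) VALUES ( 1 )")

# One explicit transaction around the whole loop; each insert still
# queries the rows written so far, because reads inside the transaction
# observe its own uncommitted writes.
db.execute("BEGIN")
for _ in range(10):
    (prev,) = db.execute("SELECT value FROM tbl ORDER BY id DESC LIMIT 1").fetchone()
    db.execute("INSERT INTO tbl ( value ) VALUES ( ? )", (prev * 2,))
db.execute("COMMIT")

(count,) = db.execute("SELECT COUNT(*) FROM tbl").fetchone()
print(count)  # 11
db.close()
```

This only helps if the dependent reads happen on the same connection that is writing, which is usually the case in a single-process benchmark like the one above.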
I can reproduce similar times on my Windows 10 laptop with an HDD (~0.03 ms per query in-memory and ~170 ms per query writing to disk).
For comparison, I wrote a (hopefully) equivalent benchmark for python's built-in sqlite3:
import sqlite3
import time
import os

def insert(db, s):
    db.execute('INSERT INTO tbl ( value ) VALUES ( :value )', { "value": s })

def bench(filename, n = 10000):
    db = sqlite3.connect(filename, isolation_level=None)
    db.execute('CREATE TABLE tbl ( id INTEGER PRIMARY KEY AUTOINCREMENT, value TEXT )')
    print(f'start {filename} w/ {n} insertions')
    a = time.perf_counter()
    for i in range(n):
        insert(db, f'{i}')
    b = time.perf_counter()
    dt = (b - a) * 1000
    print(f'total {dt} ms, {dt / n} ms per query')
    db.close()

bench(':memory:')
bench('test.db', 10)
os.remove('test.db')
isolation_level=None should enable autocommit, which as far as I understand matches the default behavior of x/sqlite.
With this benchmark in Python I got times of ~0.006 ms per query in-memory and ~170 ms per query writing to disk, which is a practically identical time for the IO-bound case. This suggests that while the IO is a bottleneck, it is not a problem with this specific SQLite library or with Deno, but rather a general issue with running SQLite in autocommit mode without WAL. (It is worth noting that, if I understood the documentation correctly, autocommit is disabled by default in Python's sqlite3 library.)
For reference, enabling WAL in python reduced the time of this benchmark to an average of ~40 ms per query on my laptop.
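For anyone reproducing this, enabling WAL in Python's sqlite3 is presumably a one-line pragma, something like the sketch below (WAL only applies to on-disk databases, in-memory ones report "memory", so a throwaway file path is used):

```python
import os
import sqlite3
import tempfile

# WAL requires a real file on disk; :memory: databases ignore it.
path = os.path.join(tempfile.mkdtemp(), "wal_test.db")
db = sqlite3.connect(path, isolation_level=None)

# The pragma returns the journal mode actually in effect.
(mode,) = db.execute("PRAGMA journal_mode=WAL").fetchone()
print(mode)  # wal
db.close()
```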
Interesting! Enabling WAL is tricky, since that requires memory-mapping the file. We might be able to make it work by hacking around with a shared WASM buffer, but it seems tricky to get right while preserving the reliability guarantees of SQLite.
It might be possible to enable WAL if we use EXCLUSIVE locking: https://sqlite.org/wal.html#noshm (given that we don't have a good way to get memory-mapped files in Deno for now).
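A quick sketch of the no-shm path from that link, using Python's sqlite3 for illustration (this is not deno-sqlite code, just a demonstration that the pragma sequence works):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "noshm_test.db")
db = sqlite3.connect(path, isolation_level=None)

# Per https://sqlite.org/wal.html#noshm: setting EXCLUSIVE locking mode
# before the first WAL operation lets SQLite skip the shared-memory
# (-shm) file, at the cost of allowing only this single connection.
db.execute("PRAGMA locking_mode=EXCLUSIVE")
(mode,) = db.execute("PRAGMA journal_mode=WAL").fetchone()
print(mode)  # wal
db.close()
```

The single-connection restriction is arguably a reasonable fit for deno-sqlite, since its WASM build already serves one connection per DB instance.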
There are also other ways to make INSERTs faster; see: https://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite
E.g. if you do:
db.transaction(() => {
  for (let i = 0; i < n; i++) {
    insert(db, `${i}`);
  }
});
Then [0.0059 ms per query]:
start :memory: w/ 10000 insertions
total 67 ms, 0.0067 ms per query
start test.db w/ 10000 insertions
total 59 ms, 0.0059 ms per query
Write transactions in SQLite are slow, as you've noticed, because SQLite asks the OS to sync the file changes to disk. Disabling that with db.query("PRAGMA synchronous = OFF"); is also an option. This yields (no transaction used) [0.1398 ms per query]:
start :memory: w/ 10000 insertions
total 93 ms, 0.0093 ms per query
start test.db w/ 10000 insertions
total 1398 ms, 0.1398 ms per query
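For completeness, the effect of that pragma can be checked like so (Python's sqlite3 for brevity; keep in mind that with synchronous = OFF, an OS crash or power loss, not just an application crash, can corrupt the database):

```python
import sqlite3

db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("PRAGMA synchronous = OFF")

# PRAGMA synchronous reads back as 0 for OFF, 1 for NORMAL, 2 for FULL.
(level,) = db.execute("PRAGMA synchronous").fetchone()
print(level)  # 0
db.close()
```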
For reference, this is the perf on my machine with no changes (NVMe SSD, Linux) [1.3012 ms per query]:
start :memory: w/ 10000 insertions
total 98 ms, 0.0098 ms per query
start test.db w/ 10000 insertions
total 13012 ms, 1.3012 ms per query
I feel like this issue can be closed. It seems that most, if not all, of the slowdown observed here comes down to how SQLite operates, not to any problem with this Deno integration.
I think I agree with @FeldrinH. There is not much we can do to make OS sync / Deno's IO faster, and solutions exist, e.g. using transactions or turning off sync if you are happy with that trade-off.