better-sqlite3
[Discussion] large sqlite project structure
Hi all, I've been using better-sqlite3 for a few years on small-ish projects and libraries. In those instances, using the library doesn't require a lot of thought. It generally looks like:
- instantiate a database
- instantiate some statements
- start the application
However I want to take a stab at a large-ish web application using a sqlite backend. I find I'm bikeshedding a ton on the proper way to structure this project. The questions I find myself asking are:
- How should I go about typing (in typescript) my queries?
- Should I use model files?
- How do I handle migrations?
- How do I handle preparing statements that depend on migrations having been run?
The current iteration looks like the following:
- initialize better-sqlite3 database
- grab project version from a table that I know will exist for the project's lifetime
- if the current version > version in db, perform each migration in between
- initialize all the other statements
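The version-check-then-migrate startup described above can be sketched roughly like this. Note that `VersionedDb` and the migration list are hypothetical placeholders, not better-sqlite3 API; `getVersion`/`setVersion` stand in for reading and writing the dedicated version table:

```typescript
// Hypothetical minimal interface over the real database handle.
interface VersionedDb {
  getVersion(): number
  setVersion(version: number): void
  exec(sql: string): void
}

// Ordered list of migrations; index + 1 is the schema version each one produces.
const MIGRATIONS: string[] = [
  `CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)`,
  `ALTER TABLE book ADD COLUMN published INTEGER NOT NULL DEFAULT 0`,
]

function migrate(db: VersionedDb, migrations: string[] = MIGRATIONS): void {
  const current = db.getVersion()
  // run only the migrations between the stored version and the code's version
  for (let v = current; v < migrations.length; v++) {
    db.exec(migrations[v])
    db.setVersion(v + 1)
  }
}
```

With better-sqlite3 you would normally wrap each step in `db.transaction(...)` so a failed migration rolls back cleanly.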
A model file essentially looks like this:
/* --============= Table Row Definitions =============-- */
interface BookTR {
  id: number
  title: string
  published: boolean
  published_date: Date
}
/* --================ Model Definition ================-- */
class BookModel extends Model {
  private insert = this.register(InsertBookStatement)
  private select_one = this.register(SelectOneBookStatement)

  public create(book_data: Omit<BookTR, 'id'>) {
    const book_id = this.insert(book_data)
    return this.select_one(book_id)
  }
}
/* --=================== Statements ===================-- */
class InsertBookStatement extends Statement {
  sql = `INSERT INTO book (title, published, published_date) VALUES (@title, @published, @published_date)`
  stmt = this.register(this.sql)

  call(book_data: Omit<BookTR, 'id'>) {
    // SQLite has no boolean or date types, so coerce before binding
    const sql_data = {
      ...book_data,
      published: book_data.published ? 1 : 0,
      published_date: book_data.published_date.toISOString(),
    }
    const info = this.stmt.ref.run(sql_data)
    return info.lastInsertRowid
  }
}
class SelectOneBookStatement extends Statement {
  sql = `SELECT * FROM book WHERE id = ?`
  stmt = this.register(this.sql)

  call(book_id: BookTR['id']) {
    return this.stmt.ref.get(book_id)
  }
}
and the main database file looks like this:
class Database {
  private db: BetterSqlite3.Database

  constructor(database_path: string) {
    this.db = new BetterSqlite3(database_path)
  }

  init() {
    for (const model of this.registered_models) model.init()
  }

  // model definitions
  book = this.register(BookModel)
}
usage:
const db = new Database('sqlite.db')
db.init()
const book = db.book.create({
  title: "Oh The Places You'll Go",
  published: true,
  published_date: new Date('1997-05-24'),
})
Essentially, all the register methods on Statements, Models, and the Database do is instantiate the class with a reference to the better-sqlite3 database and store the registered objects in an array that an init method later iterates over (what this really allows is preparing all SQL statements only when init is called, i.e. after migrations have run).
I am curious if anyone else has used better-sqlite3 for large projects. If you have, I would find it incredibly helpful to see some project structure, and how sql statements are managed.
Signal-Desktop uses a custom better-sqlite3 package with sqlcipher
https://github.com/signalapp/Signal-Desktop/blob/9ada9f6a479bcf690c55ba676799979a175f2ade/ts/sql/Server.ts#L412-L428
Thanks for sharing! It's nice to see a real-world TS project using hand-rolled sqlite statements. Unfortunately, though, this is less than I was hoping for... it looks like they're just preparing all statements inline:
async function getAllConversations(): Promise<Array<ConversationType>> {
  const db = getInstance();
  const rows: ConversationRows = db
    .prepare<EmptyQuery>(
      `
      SELECT json, profileLastFetchedAt
      FROM conversations
      ORDER BY id ASC;
      `
    )
    .all();

  return rows.map(row => rowToConversation(row));
}
That simplifies their codebase and state management, but it's not as performant, since each call re-prepares the statement.
I will be publishing a wrapper for better-sqlite3 soon (by soon I mean probably in around a month or two). I'll post a link when it is published, but to be honest I also struggle with this.
Looks like TypeORM has a better-sqlite3 driver
I am curious if anyone else has used better-sqlite3 for large projects. If you have, I would find it incredibly helpful to see some project structure, and how sql statements are managed.
We are using Better-SQLite3 at my company: over a hundred tables, millions of rows, and different databases for different versions. It is hard to handle sometimes because most of the statements involve untrusted user input from the company's customers. Some functions create dynamic table names from user input, which makes SQL injection hard to prevent (we use a whitelist for table names, etc.). The project is a live system that handles thousands of statements per second. We were actually very surprised that Better-SQLite3 stays incredibly fast in larger project structures. Our biggest problem is fixing particular bugs in millions of lines of code, because every time we publish a new update, some tables need to change, and those changes are hard to maintain.
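Since SQLite cannot bind table names as `?` parameters, the whitelist approach mentioned above is about the only safe option for dynamic table names. A minimal sketch (the table list is illustrative, not from the project above):

```typescript
// Table names cannot be passed as ? placeholders, so validate them
// against a fixed whitelist before interpolating into SQL.
const ALLOWED_TABLES = new Set(["customers", "orders", "invoices"])

function tableRef(name: string): string {
  if (!ALLOWED_TABLES.has(name)) {
    throw new Error(`Refusing to query unknown table: ${name}`)
  }
  // quote the identifier anyway, for defense in depth
  return `"${name}"`
}

// column values still go through normal ? parameters
function selectAllSql(table: string): string {
  return `SELECT * FROM ${tableRef(table)} WHERE id = ?`
}
```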
Do you have type safety around your statements, @Batur123?
Because everytime we publish new update some of tables needs to change. And that changes are hard to maintain.
What do migrations look like?
How do you structure your files?
I'm using better-sqlite3 for PhotoStructure.
I'm almost entirely using Knex.js for query building, to avoid SQL injection.
Knex.js has migrations support, but I haven't used it.
I wrote my own migrator for PhotoStructure, as I have code-based migrations as well that require custom db functions.
Here's a directory example of migrations (I have a different directory for each schema: I currently have 2 schemas).
https://github.com/photostructure/photostructure-for-servers/tree/main/migrations
The migrations table schema looks like this:
CREATE TABLE IF NOT EXISTS migrations (
  id integer NOT NULL PRIMARY KEY,
  name varchar(255) NOT NULL,
  migration_time integer NOT NULL
)
The code to fetch filesystem migrations looks like this (lazy, thenMap, and sortBy should be self-explanatory):
readonly fsMigrations = lazy(() =>
  thenMap(
    this.migrationsDir.children(f => f.ext === ".sql"),
    arr => sortBy(arr, ea => ea.base) // we must run them in order!
  )
)
Code to see what migrations are currently applied to the db:
readonly migrationsInDatabase = lazy(() =>
  this.db.prepare("SELECT name from migrations").pluck().all()
)
Have you seen this approach using pragma user_version?
It can get you migration capabilities during app startup that don't require a primordial table, created out of band, to store the migration state.
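For reference, the pragma round-trip looks roughly like this. In real better-sqlite3, `db.pragma("user_version", { simple: true })` reads the value and `db.pragma("user_version = N")` sets it; the `PragmaDb` interface below is just a stand-in so the sketch is self-contained:

```typescript
// Stand-in for the two better-sqlite3 calls this needs:
//   db.pragma("user_version", { simple: true })  -> number
//   db.pragma(`user_version = ${n}`)             -> sets it
interface PragmaDb {
  pragma(stmt: string, options?: { simple: boolean }): unknown
  exec(sql: string): void
}

function migrateWithUserVersion(db: PragmaDb, migrations: string[]): void {
  const current = db.pragma("user_version", { simple: true }) as number
  for (let v = current; v < migrations.length; v++) {
    db.exec(migrations[v])
    db.pragma(`user_version = ${v + 1}`)
  }
}
```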
I think it's a taste issue; as with all architecture decisions, go with whatever you feel is more reasonable or defensible to the Engineers Of Tomorrow (who will be reviewing the code you write today).
(I prefer what I've written, fwiw: it's explicit and observable)
If anyone is curious, this is the approach I took for migrations https://github.com/andykais/forager/blob/main/src/db/migrations/index.ts, which is then called during initialization https://github.com/andykais/forager/blob/main/src/db/sqlite.ts#L39. The migrations themselves are entirely manual; e.g., I often end up creating temporary tables and what-have-you. It is very similar to the approach in the blog post @tphummel shared, though I did not know about PRAGMA user_version. I just created my own table, forager, which I have committed to never changing at any point in the app's future. I may switch over to PRAGMA user_version though; that seems like a more normalized place to put a db version.
Possibly the one interesting bit of cleverness in here is that I have a test that migrates a database all the way from 0.1.0 to the current version, and then compares the resulting database schema to that of a freshly initialized database. I can of course run other regular test checks once the migration is complete, but this gives me some sanity about migration success. The one dumb thing about this approach is that I attempt to emulate code from older versions in order to initialize my database as 0.1.0. I should instead just check in a database starting at whatever version I want to test a migration from.
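The schema comparison itself can be as simple as diffing normalized rows of `SELECT type, name, sql FROM sqlite_master` from the migrated and fresh databases. A sketch (not forager's actual test code):

```typescript
// One row of: SELECT type, name, sql FROM sqlite_master
interface SchemaRow { type: string; name: string; sql: string | null }

// Normalize (sort + collapse whitespace) so cosmetic differences don't fail the test.
function schemaFingerprint(rows: SchemaRow[]): string {
  return rows
    .map(r => `${r.type}:${r.name}:${(r.sql ?? "").replace(/\s+/g, " ").trim()}`)
    .sort()
    .join("\n")
}

function schemasMatch(migrated: SchemaRow[], fresh: SchemaRow[]): boolean {
  return schemaFingerprint(migrated) === schemaFingerprint(fresh)
}
```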
I have a test that migrates a database all the way from 0.1.0 to the current version, and then compares the resulting database schema to that of a freshly initialized database.
Can confirm: my test suite that does similar stuff is great for ensuring upgrades are less painful for my users (as PhotoStructure is self-hosted, and upgrades are done automatically, behind the scenes).
I have a directory of db dumps from different test libraries built from different versions. My test suite pulls in the dump, then "restarts the app", verifies migrations are all applied, applies several queries (whose JSON results are compared against prior runs), and also verifies db operations (like upsert) still work.
My test suite pulls in the dump, then "restarts the app", verifies migrations are all applied, applies several queries (whose JSON results are compared against prior runs), and also verifies db operations (like upsert) still work.
Oh, so you run the same test suite against multiple versions of the database? Do you have any tests specific to a particular old library, or do you just trust the blanket tests to cover everything? Do you have a link to some of that source code too? I am curious.
you run the same test suite against multiple versions of the database? Do you have any tests specific to a particular old library?
Yes. Each versioned dump has different contents, and different expected query results (so v0.7-queryname.json)
or do you just trust the blanket tests to cover everything?
Almost all of the other tests run against an empty database created by running all the migration scripts.
Hi y’all,
I found this conversation while searching for an issue and thought I’d share what I use, if you don’t mind me talking about a project that I developed 😬
It’s this: https://github.com/leafac/sqlite
It’s a wrapper around better-sqlite3 that answers some of the questions discussed here, for example, migrations (using user_version), how to manage the prepared statements, how to prevent SQL injection, and so forth.
It’s lean enough that I’m using it in small projects, and it seems to be scaling well to relatively larger projects (approximately 30,000 lines of code and a thousand users).