better-sqlite3 icon indicating copy to clipboard operation
better-sqlite3 copied to clipboard

[Discussion] large sqlite project structure

Open andykais opened this issue 4 years ago • 14 comments
trafficstars

Hi all, I've been using better-sqlite3 for a few years on small-ish projects and libraries. In those instances, using the library doesn't require a lot of thought. It generally looks like:

  1. instantiate a database
  2. instantiate some statements
  3. start the application

However I want to take a stab at a large-ish web application using a sqlite backend. I find I'm bikeshedding a ton on the proper way to structure this project. The questions I find myself asking are:

  • How should I go about typing (in typescript) my queries?
  • Should I use model files?
  • How do I handle migrations?
    • How do I handle preparing statements that depend on migrations being ran?

The current iteration looks like the following:

  1. initialize better-sqlite3 database
  2. grab project version from a table that I know will exist for the project's lifetime
  3. if the current version > version in db, perform each migration in between
  4. initialize all the other statements

A model file essentially looks like this:

/* --============= Table Row Definitions =============-- */

interface BookTR {
  id: number
  title: string
  published: boolean
  published_date: Date
}

/* --================ Model Definition ================-- */

class BookModel extends Model {
  private insert = this.register(InsertBookStatement)
  private select_one = this.register(SelectOneBookStatement)

  public create(book_data: Omit<BookTR, 'id'>) {
    const book_id = this.insert(book_data)
    return this.select_one(book_id)
  }
}

/* --=================== Statements ===================-- */

class InsertBookStatement extends Statement {
  sql = `INSERT INTO book (title, published, published_date) VALUES (@title, @published, @published_date)`
  stmt = this.register(this.sql)

  call(book_data: Omit<BookTR, 'id'>) {
    const sql_data = {...book_data, published ? 1 : 0, published_date: book_data.published_date.toString()}
    const info = this.stmt.ref.run(sql_data)
    return info.lastInsertRowid
  }
}

class SelectOneBookStatement extends Statement {
  sql = `SELECT * FROM book WHERE id = ?`
  stmt = this.register(this.sql)

  call(book_id: BookTR['id']) {
    return this.stmt.ref.get(book_id)
  }
}

and the main database file looks like this:

class Database {
  constructor(database_path: string) {
    this.db = new BetterSqlite3(database_path)
  }
  init() {
    for (const model of this.registered_models) model.init()
  }

  // model  definitions
  book = this.register(BookModel)
}

usage:

const db = new Database('sqlite.db')
db.init()
const book = db.book.create({ title: 'Oh The Places You'll Go', published: true, published_date: new Date('5/24/1997') })

essentially all those register methods do on Statements, Models and Database is instantiate the class with a reference to the better-sqlite3 database, and store registered objects in an array that is iterated on in an init method (what this really allows for is preparing all sql statements when the init method is called).

andykais avatar Aug 04 '21 14:08 andykais

I am curious if anyone else has used better-sqlite3 for large projects. If you have, I would find it incredibly helpful to see some project structure, and how sql statements are managed.

andykais avatar Aug 04 '21 14:08 andykais

Signal-Desktop uses a custom better-sqlite3 package with sqlcipher

https://github.com/signalapp/Signal-Desktop/blob/9ada9f6a479bcf690c55ba676799979a175f2ade/ts/sql/Server.ts#L412-L428

vinaygopinath avatar Sep 08 '21 20:09 vinaygopinath

thanks for sharing! Its nice to see a real world ts project using handrolled sqlite statements. Unfortunately though, this is less than what I was hoping for... It looks like theyre just preparing all statements inline

async function getAllConversations(): Promise<Array<ConversationType>> {
  const db = getInstance();
  const rows: ConversationRows = db
    .prepare<EmptyQuery>(
      `
      SELECT json, profileLastFetchedAt
      FROM conversations
      ORDER BY id ASC;
      `
    )
    .all();

  return rows.map(row => rowToConversation(row));
}

that simplifies their codebase and state management, but its not as performant

andykais avatar Sep 09 '21 14:09 andykais

I will be publishing a wrapper for better-sqlite soon (by soon I mean probably in around a month or two), will post a link when it is published, but to be honest I also struggle.

mpmcintyre avatar Oct 25 '21 19:10 mpmcintyre

Looks like TypeORM has a better-sqlite3 driver

vinaygopinath avatar Oct 25 '21 19:10 vinaygopinath

I am curious if anyone else has used better-sqlite3 for large projects. If you have, I would find it incredibly helpful to see some project structure, and how sql statements are managed.

We are using Better-SQLite3 in company. Over hundred of tables, millions of rows and different databases for different versions. It is hard to handle sometimes because most of the statements are untrusted, user inputs from the company's customers. Some functions creates dynamic table names which they are user input and hard to handle prevent sql injections. (Using whitelist for table names etc). Project is in live system, it handles thousand of statements in second. We are actually very suprised that Better-SQLite3 is incredibly faster in larger project structures. Our biggest problem is fixing the particular bugs in million lines of codes. Because everytime we publish new update some of tables needs to change. And that changes are hard to maintain.

Batur123 avatar Oct 29 '21 11:10 Batur123

do you have type safety around your statements @Batur123?

Because everytime we publish new update some of tables needs to change. And that changes are hard to maintain.

What do migrations look like?

How do you structure your files?

andykais avatar Oct 29 '21 13:10 andykais

I'm using better-sqlite3 for PhotoStructure.

I'm almost entirely using Knex.js for query building, to avoid SQL injection.

Knex.js has migrations support, but I haven't used it.

I wrote my own migrator for PhotoStructure, as I have code-based migrations as well that require custom db functions.

Here's a directory example of migrations (I have a different directory for each schema: I currently have 2 schemas).

https://github.com/photostructure/photostructure-for-servers/tree/main/migrations

The migrations table schema looks like this:

CREATE TABLE IF NOT EXISTS migrations (
  id integer NOT NULL PRIMARY KEY, 
  name varchar(255) NOT NULL, 
  migration_time integer NOT NULL 
)

The code to fetch filesystem migrations looks like this (lazy, thenMap and sortBy should be self explanatory):

  readonly fsMigrations = lazy(() =>
    thenMap(
      this.migrationsDir.children(f => f.ext === ".sql"),
      arr => sortBy(arr, ea => ea.base) // we must run them in order!
    )
  )

Code to see what migrations are currently applied to the db:

  readonly migrationsInDatabase = lazy(() =>
    this.db.prepare("SELECT name from migrations").pluck().all()
  )

mceachen avatar May 15 '22 17:05 mceachen

Have you seen this approach using pragma user_version?

It can get you migration capabilities during app startup that doesn't require a primordial table created out of band to store the migration state.

tphummel avatar Jun 02 '22 18:06 tphummel

I think it's a taste issue--as all architecture decisions, go with whatever you feel is more reasonable or defensible to the Engineers Of Tomorrow (who are reviewing the code you write today).

(I prefer what I've written, fwiw: it's explicit and observable)

mceachen avatar Jun 02 '22 18:06 mceachen

If anyone is curious, this is the approach I took for migrations https://github.com/andykais/forager/blob/main/src/db/migrations/index.ts, which is then called during initialization https://github.com/andykais/forager/blob/main/src/db/sqlite.ts#L39. The migrations themselves are entirely manual. E.g. I often end up creating temporary tables and what-have-you. It is very similar to the approach in the blog post @tphummel took, though I did not know about PRAGMA user_version. I just created my own table forager that I have committed to not changing during any point in the apps future. I may switch over to PRAGMA user_version though. That seems like a more normalized place to put a db version.

Possibly the one interesting smartness in here is that I have a test that migrates a database all the way from 0.1.0 to the current version, and then compares the resulting database schema to that of a freshly initialized database. I of course can do other regular test checks once the migration is complete, but this lets me have some sanity about migration success. The one dumb thing about this approach is I attempt to emulate old code from older versions in order to initialize my database as 0.1.0. I should instead just check in a database starting at whatever version I want to test a migration from.

andykais avatar Jun 02 '22 19:06 andykais

I have a test that migrates a database all the way from 0.1.0 to the current version, and then compares the resulting database schema to that of a freshly initialized database.

Can confirm: my test suite that does similar stuff is great for ensuring upgrades are less painful for my users (as PhotoStructure is self-hosted, and upgrades are done automatically, behind the scenes).

I have a directory of db dumps from different test libraries built from different versions. My test suite pulls in the dump, then "restarts the app", verifies migrations are all applied, applies several queries (whose JSON results are compared against prior runs), and also verifies db operations (like upsert) still work.

mceachen avatar Jun 02 '22 20:06 mceachen

My test suite pulls in the dump, then "restarts the app", verifies migrations are all applied, applies several queries (whose JSON results are compared against prior runs), and also verifies db operations (like upsert) still work.

oh, so you run the same test suite against multiple versions of the database? Do you have any tests specific to a particular old library, or do you just trust the blanket tests to cover everything? Do you have a link to some of that source code too? I am curious

andykais avatar Jun 06 '22 18:06 andykais

you run the same test suite against multiple versions of the database? Do you have any tests specific to a particular old library?

Yes. Each versioned dump has different contents, and different expected query results (so v0.7-queryname.json)

or do you just trust the blanket tests to cover everything?

Most all of the other tests run against an empty database created by running all the migration scripts.

mceachen avatar Jun 06 '22 22:06 mceachen

Hi y’all,

I found this conversation while searching for an issue and thought I’d share what I use, if you don’t mind me talking about a project that I developed 😬

It’s this: https://github.com/leafac/sqlite

It’s a wrapper around better-sqlite3 that answers some of the questions discussed here, for example, migrations (using user_version), how to manage the prepared statements, how to prevent SQL injection, and so forth.

It’s lean enough that I’m using in small projects, and it seems to be scaling well to relatively larger projects (approximately 30,000 lines of code and a thousand users).

leafac avatar Nov 15 '22 15:11 leafac