
ERROR libsql::replication: replicator sync | database disk image is malformed

Open Aft1n opened this issue 1 year ago • 16 comments

I have an app deployed in production that keeps a replica DB in the app folder. Today I checked the logs and saw this error:

ERROR libsql::replication: replicator sync error: replication error: Injector error: SQLite error: database disk image is malformed

Everything was working perfectly fine for a couple of weeks and no changes were made to the DB schema; it simply started showing this error in the logs.

Aft1n avatar Jul 25 '24 16:07 Aft1n

Somehow my replica DB got corrupted. After resyncing the replica DB locally and pushing it back to the server, it was fixed. What could have caused the problem? Multiple user writes, or something else?

Aft1n avatar Jul 25 '24 17:07 Aft1n

Sometimes embedded replicas get corrupted when their DB file is opened with a regular sqlite3 driver. Maybe you did something like that?

@LucioFranco Would you be able to help please?

haaawk avatar Jul 25 '24 18:07 haaawk

This file was deployed in production, so I didn't touch it. I did have some active traffic to the website before noticing this error, though it was mostly reads. At first I thought it was some kind of attack, like SQL injection (I'm not very proficient in those issues).

But could multiple simultaneous writes, or an error during one of them, corrupt it?

Aft1n avatar Jul 25 '24 18:07 Aft1n

It is not possible to write directly to embedded replicas. Writes are forwarded to a primary in the cloud and then fetched back with sync, so multiple writers should be fine, I think.

haaawk avatar Jul 25 '24 18:07 haaawk

Yeah, I know that. But if there was a race condition or some error during writes, maybe it could corrupt the replica DBs on sync. The interesting thing is that I have a full-stack app with a replica and also a separate API with its own replica. Even though the activity was happening in the full-stack app, the DB inside my API application was corrupted as well.

So basically something was going on in the Turso parent DB, and this error was synced to both replicas. But everything worked totally fine in the Turso dashboard, and deleting the replicas and syncing again fixed it.

Aft1n avatar Jul 25 '24 19:07 Aft1n

@LucioFranco I think you should investigate that as part of your embedded replicas work

haaawk avatar Jul 26 '24 08:07 haaawk

@Aft1n could you share the version of libsql that you were using?

LucioFranco avatar Jul 29 '24 15:07 LucioFranco

I was using the latest - 7.0

Aft1n avatar Jul 29 '24 16:07 Aft1n

I think your theory sounds correct. If you start to see this issue again, you can ping me on Discord and I can take a look at what is going on. This is slightly hard to debug since the malformed error is quite cryptic.

LucioFranco avatar Jul 30 '24 14:07 LucioFranco

So this error happened again. Here are the errors from the logs:

22:56:11 1|main | error: replication error: Injector error: SQLite error: database disk image is malformed
22:56:11 1|main | at new Database (/app/node_modules/libsql/index.js:75:17)
22:56:11 1|main | at _createClient (/app/node_modules/@libsql/client/lib-esm/sqlite3.js:39:16)
22:56:11 1|main | at /app/src/db/index.ts:7:28
22:56:11 1|main | Bun v1.1.9 (Linux x64)

22:54:47 1|main | 2024-08-02T22:54:47.369381Z ERROR libsql::replication: replicator sync error: replication error: Injector error: SQLite error: database disk image is malformed
22:54:47 0|main | 2024-08-02T22:54:47.376994Z ERROR libsql::replication: replicator sync error: replication error: Injector error: SQLite error: database disk image is malformed

I have also noticed that my Turso sync usage skyrocketed, possibly because the sync kept hitting this error and couldn't go through. It's currently at 5 GB/2 GB.
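
One way to keep a failing replica from burning through the sync quota might be to drive sync manually and stop retrying once the error looks like corruption. This is only a sketch under stated assumptions: `isMalformedError`, `backoffDelayMs`, and `syncWithBackoff` are hypothetical helpers, not part of libsql, and `sync` is assumed to be any function that performs a single sync attempt (for example a wrapper around the client's `sync()` call, with no automatic sync interval configured).

```typescript
// Hypothetical helpers; none of these names come from libsql itself.

// True when the error text matches the corruption message seen in the logs above.
function isMalformedError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return message.includes("database disk image is malformed");
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs one sync attempt at a time. Other failures are retried with backoff;
// a malformed-image error is rethrown immediately, on the assumption that
// retrying will not repair the local file.
async function syncWithBackoff(
  sync: () => Promise<unknown>,
  maxAttempts = 5,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await sync();
      return;
    } catch (err) {
      if (isMalformedError(err) || attempt === maxAttempts - 1) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The caller would catch the rethrown malformed error and alert or rebuild the replica instead of syncing again. Whether this actually reduces billed sync usage is untested here.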

Aft1n avatar Aug 02 '24 23:08 Aft1n

I also found this one in my full-stack app, which is the source of all changes sent to Turso. I assume this is a cron task that failed, or something happened that closed the connection. Could this have affected it?

error logs

Aft1n avatar Aug 02 '24 23:08 Aft1n

Having the same issue with a deployment on Railway.

vanillacode314 avatar Sep 14 '24 19:09 vanillacode314

@Aft1n @vanillacode314 are either of you still experiencing this issue? If so could you try to give me a small example of how you're able to trigger it. Unfortunately, just the error message is not very helpful and doesn't really tell us why it was malformed and I am unable to reproduce this.

LucioFranco avatar Oct 21 '24 21:10 LucioFranco

In my case it's pretty random. I have 2 instances of my app deployed on different servers, and as you can see from the screenshot, one works perfectly fine and the other throws this error. Sometimes both produce this error. I need to delete the Turso replica files in my project and redeploy my app to make it go away.

I have updated to the latest version of libsql - "@libsql/client": "^0.14.0" but the issue still persists. And if I'm not paying attention, my Turso embedded syncs skyrocket; they are currently at 10 GB/3 GB.

[Screenshot 2024-10-30 at 10 57 28]
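
The "delete the replica files and resync" workaround described above could be scripted roughly as follows. This is a sketch, not an official recovery procedure: `replicaFiles` and `resetReplica` are hypothetical names, and the list of sidecar suffixes (`-wal`, `-shm`, `-info`) is an assumption about what may sit next to the replica file. Files that do not exist are skipped.

```typescript
import { existsSync, rmSync } from "node:fs";

// Assumption: the replica lives at `dbPath`, with sidecar files next to it.
// The suffix list is a guess; adjust it to what is actually on disk.
const SIDECAR_SUFFIXES = ["", "-wal", "-shm", "-info"];

// Lists the replica file and its possible sidecar files.
function replicaFiles(dbPath: string): string[] {
  return SIDECAR_SUFFIXES.map((suffix) => dbPath + suffix);
}

// Removes the replica and any sidecar files that exist;
// returns the paths that were actually deleted.
function resetReplica(dbPath: string): string[] {
  const removed: string[] = [];
  for (const file of replicaFiles(dbPath)) {
    if (existsSync(file)) {
      rmSync(file);
      removed.push(file);
    }
  }
  return removed;
}
```

It would need to run before the app opens the database (for example at startup, after detecting the malformed error on a previous run), so that no open connection still points at the deleted files. The next sync then has to fetch the whole database again.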

Aft1n avatar Oct 30 '24 09:10 Aft1n

@LucioFranco I just experienced this in my dev setup, when trying to ingest a lot of data in a loop. So basically just doing a lot of sequential inserts. I'm using "drizzle-orm": "^0.38.3".

I think it might be related to the syncInterval: 60 setting. Removing it seems to help; now it failed with:

2024-12-26T16:58:48.147028Z ERROR tower_http::trace::on_failure: response failed classification=Error: status: Internal, message: "Invalid header bit 45 expected 0 or 1", details: [], metadata: MetadataMap { headers: {} } latency=0 ms
Error: Replication(Client(Status { code: Internal, message: "Invalid header bit 45 expected 0 or 1", source: None }))
    at Statement.run (/Users/oscar/dev/priv/brf-imd/brf-imd/node_modules/libsql/index.js:296:29)
    at executeStmt (/Users/oscar/dev/priv/brf-imd/brf-imd/node_modules/@libsql/client/lib-cjs/sqlite3.js:294:34)
    at Sqlite3Client.execute (/Users/oscar/dev/priv/brf-imd/brf-imd/node_modules/@libsql/client/lib-cjs/sqlite3.js:101:16)
    at LibSQLPreparedQuery.run (webpack-internal:///(action-browser)/./node_modules/drizzle-orm/libsql/session.js:128:58)
    at QueryPromise.run (webpack-internal:///(action-browser)/./node_modules/drizzle-orm/sqlite-core/query-builders/insert.js:164:28)
    at QueryPromise.execute (webpack-internal:///(action-browser)/./node_modules/drizzle-orm/sqlite-core/query-builders/insert.js:176:54)
    at QueryPromise.then (webpack-internal:///(action-browser)/./node_modules/drizzle-orm/query-promise.js:26:17)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5) {
  code: ''
}

oscar-b avatar Dec 26 '24 17:12 oscar-b

I'm experiencing a similar issue. Has the problem been fixed?

intaek-h avatar Mar 30 '25 10:03 intaek-h