
Hitting SQLITE_TOOBIG when trying to deploy via nuxthub

Open acidjazz opened this issue 7 months ago • 3 comments

I think this issue possibly belongs here.

The error I'm getting:

Run nuxt-hub/action@v1
Processing database migrations...
Found 0 applied database migrations
Applying database migration 0000_purple_omega_flight...
Applied database migration 0000_purple_omega_flight
1 database migrations applied
Applying 5 database queries...
Applying database query content-database-001...
Error: Failed to query database: statement too long: SQLITE_TOOBIG
Error: Failed to apply database query content-database-001: Failed to query database: statement too long: SQLITE_TOOBIG

acidjazz avatar May 17 '25 17:05 acidjazz

update: SQLITE_TOOBIG is a sqlite error - not a D1 limit

I tried some PRs that attempt to change SLICE_SIZE in src/utils/collection.ts to a smaller size; even at 20k I'm still getting this error.

While I still get the error after adjusting that parameter, I'm seeing more inserts happen in one of the content tables as the deploy is attempted. Not sure if that means much, or if I'm in the right place when attempting to fix this.

here's a bit more context on my content/:

total of 321 markdown files, here are some folder sizes

❯ du -sh *
336K	advisories
1.3M	blog
136K	events
 88K	news
132K	pages
176K	press

here are some of the largest sizes in blog/

❯ du -sh * | sort -hr
 36K	xiongmai-iot-exploitation.md
 36K	confluence-dreams-of-shells.md
 36K	bigant-cve-2025-0364.md
 28K	new-cve-2022-1388.md
 24K	zyxel-telnet-vulns.md
 24K	openfire-cve-2023-32315.md
 24K	moobot-uses-fake-vulnerability.md
 24K	mikrotik-foisted-revisited.md
 24K	2022-missing-kev-report.md

acidjazz avatar May 18 '25 06:05 acidjazz

Update: I found the culprit that is triggering this D1 error and will be able to whip up a reproduction when I get more time, which should help target what needs to be tweaked. Essentially it's heavy metadata in the content config.

acidjazz avatar May 19 '25 18:05 acidjazz

Update: I think this is happening when we use rawbody https://content.nuxt.com/docs/advanced/raw-content

acidjazz avatar May 19 '25 19:05 acidjazz

@acidjazz hey, do you have an idea of how to circumvent this issue? I have a site with lots of content; when I started migrating and added the first 1000 content entries, I started getting this error. I don't use raw content.

MauPalantir avatar Jul 17 '25 21:07 MauPalantir

> @acidjazz hey, do you have an idea of how to circumvent this issue? I have a site with lots of content; when I started migrating and added the first 1000 content entries, I started getting this error. I don't use raw content.

I do not currently.. we just stopped using raw content and it went away. Maybe take a look at your frontmatter and see how big it is?

acidjazz avatar Jul 18 '25 00:07 acidjazz

Unfortunately I don't have rawbody anywhere in my content, and the frontmatter is only some simple fields. Not sure if that matters, since the body seems to be inserted into SQL as well... Maybe it's because we use a lot of non-ASCII characters? Would that change the byte size of the chunks? I don't know.

If I manually set the slice size to a very small value, like 7000 instead of the original 70000, the error went away. I wonder if we could just make this value configurable.
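To illustrate the knob being discussed, here is a minimal, hypothetical sketch of character-based body slicing with a configurable slice size (the real logic lives in src/utils/collection.ts; `sliceBody` and its signature are assumptions for illustration, not the module's actual API):

```typescript
// Hypothetical sketch: split a serialized body into slices of at most
// `sliceSize` characters, so each generated SQL statement stays small.
// `sliceSize` is the value this thread proposes making configurable.
function sliceBody(body: string, sliceSize = 7000): string[] {
  const slices: string[] = []
  for (let i = 0; i < body.length; i += sliceSize) {
    slices.push(body.slice(i, i + sliceSize))
  }
  return slices
}
```

Note that `String.prototype.slice` counts UTF-16 code units, not bytes, so a 7000-"character" slice of non-ASCII text can be considerably more than 7000 bytes once UTF-8 encoded, which may be relevant to the byte-based SQLITE_TOOBIG limit.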

@acidjazz maybe we could resume your work along this line, not sure how far you got.

MauPalantir avatar Aug 08 '25 17:08 MauPalantir

I'm currently running into this problem as well, and I'm comparing two different pages: one with seemingly far shorter content but some non-ASCII characters, and one with just a lot of pure text. The shorter one fails. It would really be amazing if SLICE_SIZE could be configurable.

TheAnachronism avatar Aug 14 '25 21:08 TheAnachronism

OK, so after some further experimentation: theoretically, the generated INSERT statements could be updated to instead use concatenating UPDATE statements, so that

INSERT INTO _content_asdf (..., body, ...)
VALUES (..., '<full body>', ...)

becomes

INSERT INTO _content_asdf (..., body, ...)
VALUES (..., '', ...)

With the split up UPDATE statements:

UPDATE _content_asdf
SET body = body || '<chunk>'
WHERE id = '<id>'

This way, no single SQL statement is too big, and the entire Unicode body can be properly imported into the DB.
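The proposal above could be sketched as a statement generator. This is a hypothetical illustration, not code from nuxt/content: the table name mirrors the thread's example, `buildChunkedStatements` and `escapeSql` are invented helpers, and real code should prefer bound parameters over string escaping where the driver allows it:

```typescript
// Hypothetical sketch of the INSERT + chunked-UPDATE idea: insert the row
// with an empty body, then append the body in fixed-size chunks via UPDATE,
// so no single statement exceeds the SQL length limit.

// Minimal single-quote escaper, for illustration only.
function escapeSql(value: string): string {
  return value.replace(/'/g, "''")
}

function buildChunkedStatements(id: string, body: string, chunkSize = 7000): string[] {
  const statements = [
    `INSERT INTO _content_asdf (id, body) VALUES ('${escapeSql(id)}', '')`,
  ]
  for (let i = 0; i < body.length; i += chunkSize) {
    const chunk = escapeSql(body.slice(i, i + chunkSize))
    statements.push(
      `UPDATE _content_asdf SET body = body || '${chunk}' WHERE id = '${escapeSql(id)}'`,
    )
  }
  return statements
}
```

Each generated statement is bounded by roughly `chunkSize` characters plus the fixed SQL scaffolding, at the cost of one extra round trip per chunk.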

TheAnachronism avatar Aug 14 '25 21:08 TheAnachronism

> OK, so after some further experimentation: theoretically, the generated INSERT statements could be updated to instead use concatenating UPDATE statements, so that
>
> INSERT INTO _content_asdf (..., body, ...) VALUES (..., '<full body>', ...)
>
> becomes
>
> INSERT INTO _content_asdf (..., body, ...) VALUES (..., '', ...)
>
> with the split-up UPDATE statements:
>
> UPDATE _content_asdf SET body = body || '<chunk>' WHERE id = '<id>'
>
> This way, no single SQL statement is too big, and the entire Unicode body can be properly imported into the DB.

I believe the inserts are already chunked? And I think I messed with the size of each insert to try and solve my issue, with no luck.

acidjazz avatar Aug 15 '25 15:08 acidjazz

> OK, so after some further experimentation: theoretically, the generated INSERT statements could be updated to instead use concatenating UPDATE statements, so that INSERT INTO _content_asdf (..., body, ...) VALUES (..., '<full body>', ...) becomes INSERT INTO _content_asdf (..., body, ...) VALUES (..., '', ...), with the split-up UPDATE statements: UPDATE _content_asdf SET body = body || '<chunk>' WHERE id = '<id>'. This way, no single SQL statement is too big, and the entire Unicode body can be properly imported into the DB.

The generated SQL seems to follow a similar logic already; it just still generates too-big slices sometimes. I would think it's the Unicode characters, though in Hungarian only about 15% of the characters are non-ASCII, so I wonder what's actually happening.
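The byte-inflation hypothesis is easy to demonstrate: JavaScript string lengths count UTF-16 code units, while SQLite's statement limit is measured in bytes, and accented Latin characters encode to 2 UTF-8 bytes each. A small self-contained illustration (the sample strings are arbitrary Hungarian words, chosen only because this comment mentions Hungarian):

```typescript
// Character count vs. UTF-8 byte count: non-ASCII text is larger in bytes
// than its .length suggests, so a character-based slice limit can still
// produce statements that exceed a byte-based SQL limit.
function utf8Bytes(text: string): number {
  return new TextEncoder().encode(text).length
}

const ascii = 'almafa'        // 6 chars, 6 bytes
const hungarian = 'árvíztűrő' // 9 chars, but á/í/ű/ő take 2 bytes each
console.log(ascii.length, utf8Bytes(ascii))         // 6 6
console.log(hungarian.length, utf8Bytes(hungarian)) // 9 13
```

Text that is ~15% non-ASCII is therefore roughly 15% bigger in bytes than in characters, which could push a slice sized near the limit over it.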

MauPalantir avatar Aug 16 '25 11:08 MauPalantir

Are there any updates here?

I also have the same issue with around 200 articles in Arabic:

$ du -sh *
1.3M    app
568K    bun.lock
1.0K    commitlint.config.cjs
1.0K    components.json
1.3M    content
4.0K    content.config.ts
19M     dist
1.0K    eslint.config.mjs
48K     i18n
13K     lib
4.0K    LICENSE

lord007tn avatar Oct 12 '25 01:10 lord007tn

@lord007tn Yes, I was able to circumvent the issue by manually lowering the thresholds for chunk generation, MAX_QUERY_SIZE and SLICE_SIZE. I had to lower both substantially: in my case, MAX_QUERY_SIZE was lowered from 100000 to 50000, and SLICE_SIZE from 70000 to 10000, but maybe they could go up a bit, I don't know.

Maybe we should either experiment with different sizes and lower the defaults so people with non-ASCII text don't run into this issue, or leave the defaults and make them configurable.

https://github.com/nuxt/content/blob/2d21c0288c8610c7790dab3508417d423b3962de/src/utils/collection.ts#L119
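An alternative to lowering the character-based thresholds for everyone would be to enforce the budget in bytes. This is a hypothetical sketch, not the nuxt/content implementation: the constant name follows the thread, its value is illustrative, and `batchQueries` is an invented helper:

```typescript
// Hypothetical sketch: group statements into batches whose total size is
// measured in UTF-8 bytes instead of string length, so non-ASCII content
// cannot silently overflow a byte-based SQL limit.
const MAX_QUERY_SIZE = 100_000 // bytes, illustrative default

function batchQueries(statements: string[], maxBytes = MAX_QUERY_SIZE): string[][] {
  const batches: string[][] = []
  let current: string[] = []
  let currentBytes = 0
  for (const stmt of statements) {
    const bytes = new TextEncoder().encode(stmt).length
    // Start a new batch when adding this statement would exceed the budget.
    if (current.length > 0 && currentBytes + bytes > maxBytes) {
      batches.push(current)
      current = []
      currentBytes = 0
    }
    current.push(stmt)
    currentBytes += bytes
  }
  if (current.length > 0) batches.push(current)
  return batches
}
```

A byte-based budget would make the behavior independent of the script the content is written in, rather than tuning one character count that happens to work for mostly-ASCII text.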

MauPalantir avatar Oct 13 '25 09:10 MauPalantir

With the new update in v3.8.0, I believe this PR solved it: https://github.com/nuxt/content/pull/3569

I am not facing that issue anymore; I'll confirm once I test it with a larger set of articles in Arabic and English.

lord007tn avatar Oct 29 '25 23:10 lord007tn