Migration challenges with SQLite
Environment
- Operating System:
Darwin - Node Version:
v22.13.1 - Nuxt Version:
3.15.4 - CLI Version:
3.22.5 - Nitro Version:
2.11.5 - Package Manager:
[email protected] - Builder:
- - User Config:
content,ignore,router,app,scripts,devtools,modules,i18n,image,css,nitro,hooks,site,sitemap,linkChecker,schemaOrg,vite,compatibilityDate - Runtime Modules:
@nuxtjs/[email protected],@nuxt/[email protected],@nuxt/[email protected],@nuxt/[email protected],@nuxt/[email protected],@nuxt/[email protected],@nuxtjs/[email protected] - Build Modules:
-
Version
v3
Reproduction
Can only reproduce with specific hardware
Description
Hi all, we're trying to migrate from Content v2 to Content v3 and have noticed some significant challenges with SQLite and our GitLab architecture. Issues we're facing:
- Linux x86-64 runner (32 vCPU's, 128GB memory, and 200GB storage) takes 25% longer to nuxt generate compared to v2 on the same hardware. We believe the issue is something to do with SQLite. Originally the builds were significantly longer, but changing journal mode to WAL helped shave off 20%. When switching to a much less powerful macOS (6 vCPU's, 16GB memory, 50GB storage) runner, we're seeing build times match Content v2.
- SQLite errors with larger data. My content directory has a single test.md file that is just the word "test" copy pasted 8000 times. The limit appears to be around 51,980 for yml and 32,875 for md. This is for json strings in a single column, if the content is separated into more columns this error goes away.
yarn dev
yarn run v1.22.22
$ nuxt dev
Nuxt 3.15.4 with Nitro 2.10.4
➜ Local: http://localhost:3000/
➜ Network: use --host to expose
➜ DevTools: press Shift + Option + D in the browser (v1.7.0)
✔ Processed 2 collections and 1 files in 60.52ms (0 cached, 1 parsed)
ERROR [unhandledRejection] near "yml": syntax error
at Database.exec (node_modules/better-sqlite3/lib/methods/wrappers.js:9:14)
at Object.exec (node_modules/db0/connectors/better-sqlite3.mjs:27:22)
at Object.exec (node_modules/@nuxt/content/dist/module.mjs:369:10)
at processCollectionItems (node_modules/@nuxt/content/dist/module.mjs:2521:8)
at async node_modules/@nuxt/content/dist/module.mjs:2420:20
at async initNuxt (node_modules/nuxt/dist/shared/nuxt.CrJjphBv.mjs:5343:3)
at async NuxtDevServer._load (node_modules/@nuxt/cli/dist/chunks/dev2.mjs:162:5)
at async NuxtDevServer.load (node_modules/@nuxt/cli/dist/chunks/dev2.mjs:94:7)
at async NuxtDevServer.init (node_modules/@nuxt/cli/dist/chunks/dev2.mjs:89:5)
at async Object.run (node_modules/@nuxt/cli/dist/chunks/dev-child.mjs:83:5)
content.config.ts
export default defineContentConfig({
collections: {
content: defineCollection({
type: 'page',
source: '**',
}),
},
});
- It's mentioned in the migration documentation that you don't need to modify your content files. However we've had to modify our yml files to efficiently query nested content. Our blog posts for example were in .yml files with 2 objects at the root, "seo" and "content". The content object is getting dropped into the content column of the table as a text field. This makes is significantly slower/harder to query than before as our blog posts can be massive. We've been taking queryable fields out of our content column and creating new columns to index. Is there anything we're missing here?
Overall we're really excited to use Content v3, but have a few roadblocks, please let me know if there's any additional information I can provide!
Additional context
Content v2 to Content v3 Merge Request - https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/merge_requests/399
Hello @NathanDubord, Thanks for the issue description, I have some idea about the source of these errors, but some parts are surprising to me.
Linux x86-64 runner (32 vCPU's, 128GB memory, and 200GB storage) takes 25% longer to nuxt generate compared to v2 on the same hardware. We believe the issue is something to do with SQLite. Originally the builds were significantly longer, but changing journal mode to WAL helped shave off 20%. When switching to a much less powerful macOS (6 vCPU's, 16GB memory, 50GB storage) runner, we're seeing build times match Content v2.
This difference might be related to the SQLite connector we are using. There is an open PR #3230 to adopt built-in adapter which is introduced in Node 22.5+. As I checked, you are using Node 22.11, so if you check with the pr release in both of the runners we can find out if it is related to the SQLite adapter or not.
npm i https://pkg.pr.new/@nuxt/content@3230
Another improvement would be to introduce a flag to disable development cache to improve build speed. This cache database improves performance in development, which prevents unnecessary parses. But in CI it is always empty and checking and filling is not necessary.
https://github.com/nuxt/content/blob/75121d5bf408c050a6544f5a005c19419f280bee/src/module.ts#L307-L309
SQLite errors with larger data. My content directory has a single test.md file that is just the word "test" copy pasted 8000 times. The limit appears to be around 51,980 for yml and 32,875 for md. This is for json strings in a single column, if the content is separated into more columns this error goes away.
This is most likely related to our query generator and splitter. Could you share .nuxt/content/database.sql?
It's mentioned in the migration documentation that you don't need to modify your content files. However we've had to modify our yml files to efficiently query nested content. Our blog posts for example were in .yml files with 2 objects at the root, "seo" and "content". The content object is getting dropped into the content column of the table as a text field. This makes is significantly slower/harder to query than before as our blog posts can be massive. We've been taking queryable fields out of our content column and creating new columns to index. Is there anything we're missing here?
Sorry to hear that, as you explained and what I get from your repository, you have a unique way to write blog posts in yaml files and you use markdown-it to parse and render in runtime. You can still keep the contents structure as is and just modify them just for Content module and querying purpose. using hooks https://content.nuxt.com/docs/advanced/hooks#contentfileafterparse
// nuxt.config.ts
export default defineNuxtConfig({
hooks: {
'content:file:afterParse'({ content }) {
// copy category out of `content` field to store it in separate column and easy filtering
content.category = content.content.category
}
}
})
Also, I am wondering what stopped you from using markdown files and providing SEO data in front-matter and use ContentRenderer to render parsed data? like:
---
title: ....
category: ...
seo:
title: ...
description: ...
---
# Markdown
Hello @farnabaz
I have the same issue here, but in development and using markdown files.
It seems that the error is thrown for large files, while processCollectionItems run the insert into queries: https://github.com/nuxt/content/blob/main/src/module.ts#L362
In my example, the error says
ERROR [unhandledRejection] near "["TairoTable","TairoTableHeading","TairoTableRow","TairoTableCell"]": syntax error
Here is the generated sql that thrown the error: https://pastebin.com/g2Jngw8G For info, the original query size is 70740 (69.08 KB)
Initially I thought it came from my metadata, but removing them raise other errors, while reducing content size make the error disappear
---
title: TairoTable
components:
- TairoTable
- TairoTableHeading
- TairoTableRow
- TairoTableCell
---
This is really strange, we are far from sqlite limites: https://www.sqlite.org/limits.html
Another point is that the errors are not caught, and nuxt crash/restart continuously
note: I've tried experimental sqlite, but this doesn't change anything
note 2: The issues do not occur with content v3.1.0, and start with content v3.2.0
Hey @stafyniaksacha, From the generated SQL, I can say that the query generator is wrong, and it does not escape newlines (check line 7, 8, 9). But debugging blindly is not ideal, I would appreciate it if you could share your collection and content file? Ideally, a simple repository to reproduce the issue.
Yep, I see those newline, but If I remove the components from the yml, then newlines are added in other place.
Here is the example of new sql output: https://pastebin.com/06x2aEdb
And here is the schema:
const schema = z.object({
// components: z.string().array(),
toc: z.boolean(),
icon: z.object({
src: z.string(),
srcDark: z.string().optional(),
}),
})
export default defineContentConfig({
collections: {
docs: defineCollection({
source: 'documentation/**',
type: 'page',
schema,
}),
},
})
Could you provide raw markdown?
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.