Custom sources
Astro's content layer provides an abstraction for loading data from social media, alternative data formats, or a headless CMS. It is no longer limited to files on disk.
Do you think something similar is within the scope of this project? If so, would you consider community contributions or collaboration towards the feature?
Yes, I've already thought about it. My plan is to create a "source" abstraction that returns the documents of a collection along with a watcher. The implementation of the source could be file system-based or something different.
I would like to export another function from core that allows the creation of such a source, called defineSource. The result could then be passed to the collection. This could be used to create packages for specific sources, such as headless CMS or remote Git repositories (#372).
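To make the idea concrete, here is a minimal sketch of what a source with a watcher could look like. Only the name defineSource and the documents function come from this thread; the `watch` callback, the `Source` and `SourceDocument` types, and the `createMemorySource` helper are assumptions made up for illustration, and the real API may differ.

```typescript
// Hypothetical shapes for illustration; none of these types come from the library.
type SourceDocument<TData> = {
  data: TData;
  _meta: { id: string };
};

type Unsubscribe = () => void;

type Source<TData> = {
  // Resolves all documents of the collection.
  documents: () => Promise<SourceDocument<TData>[]>;
  // Optional watcher: calls onChange whenever the underlying data changes.
  watch?: (onChange: () => void) => Unsubscribe;
};

// Local stand-in for the proposed defineSource: it only fixes the factory's type.
function defineSource<TData>(factory: () => Source<TData>): () => Source<TData> {
  return factory;
}

// An in-memory source, standing in for a headless CMS or a remote repository.
function createMemorySource<TData>(initial: SourceDocument<TData>[]) {
  let docs = [...initial];
  const listeners = new Set<() => void>();

  const source = defineSource<TData>(() => ({
    documents: () => Promise.resolve([...docs]),
    watch: (onChange) => {
      listeners.add(onChange);
      // Returning a function lets the caller stop watching.
      return () => {
        listeners.delete(onChange);
      };
    },
  }));

  // Adds a document and notifies every registered watcher.
  const push = (doc: SourceDocument<TData>) => {
    docs = [...docs, doc];
    listeners.forEach((listener) => listener());
  };

  return { source, push };
}
```

A file system-based source would follow the same shape, with `watch` wired to a file watcher instead of an in-memory listener set.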
I would love to see community contributions for this, but it is a non-trivial change. Getting the types correct could be quite challenging and likely involves a breaking change for the configuration.
I don't have much time to implement it myself right now, but I'm here to support you in any way I can if you're willing.
This would be cool! I wish I had the time to attempt implementation 😅
I've started working on this feature (branch feature/471_custom_source). This change is challenging, and much of the internal code needs to be rewritten.
I think this feature will take some time to complete, especially since I will be away on vacation several times in the near future. However, when I am back home, I will finish it.
Okay, with my third attempt, I have now completed the API and the types. The definition of a normal filesystem-based collection will look like this:
const posts = defineCollection({
  name: "posts",
  source: {
    directory: "src/posts",
    include: "**/*.md(x)?"
  },
  schema: z.object({
    title: z.string(),
  }),
});
The old way (directory, include, exclude, and parser properties defined at the top level of the collection) will still work but is deprecated.
Defining a collection with a custom source will look like this:
const source = defineSource(() => ({
  documents: () => Promise.resolve([
    {
      data: { title: "First Post", content: "Content of the first post" },
      _meta: {
        id: "1",
        type: "file",
        createdAt: 1750190278,
      },
    },
  ]),
  documentsHaveContent: true
}));

const posts = defineCollection({
  name: "posts",
  source,
  schema: z.object({
    title: z.string(),
  }),
});
The types for _meta are inferred. If documentsHaveContent is true, an implicit content property is added to the resulting type of the collection.
A source can also extend the context of the transform function, for example:
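For readers wondering how a boolean flag can change the resulting type, the following is a rough sketch using a conditional type. The `WithContent` type and the `finalize` helper are hypothetical names invented for this example; the actual type helpers in the branch are likely more involved.

```typescript
// Hypothetical helper: adds `content` to the document type only when the flag is `true`.
type WithContent<TDoc, THasContent extends boolean> = THasContent extends true
  ? TDoc & { content: string }
  : TDoc;

// Hypothetical runtime counterpart: copies the raw content onto the document
// when the source declares that its documents have content.
function finalize<TDoc extends object, THasContent extends boolean>(
  doc: TDoc,
  raw: { content?: string },
  hasContent: THasContent,
): WithContent<TDoc, THasContent> {
  const result = hasContent ? { ...doc, content: raw.content ?? "" } : doc;
  // The cast is needed because TypeScript cannot resolve the conditional type here.
  return result as unknown as WithContent<TDoc, THasContent>;
}

// `withBody` has the type { title: string } & { content: string }.
const withBody = finalize({ title: "First Post" }, { content: "Body" }, true);

// `withoutBody` has the type { title: string }; accessing `content` would not compile.
const withoutBody = finalize({ title: "First Post" }, { content: "Body" }, false);
```

Passing the literal `true` or `false` is what lets the compiler pick a branch of the conditional type; a plain `boolean` variable would yield a union of both shapes.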
const source = defineSource(() => ({
  documents: () => Promise.resolve([]),
  extendContext: (document) => ({
    computeSid: () => `sid-${document._meta.id}`,
  }),
}));

const posts = defineCollection({
  name: "posts",
  source,
  schema: z.object({
    title: z.string(),
  }),
  transform: (doc, ctx) => {
    return {
      ...doc,
      sid: ctx.computeSid(),
    };
  },
});
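One way to picture what happens behind extendContext is a per-document merge of the base context with whatever the source returns. The sketch below is an assumption about the mechanism, not the code in the branch; `runTransform`, `BaseContext`, and `collectionName` are hypothetical names.

```typescript
type Meta = { id: string };
type Doc<TData> = { data: TData; _meta: Meta };

// Hypothetical base context that every transform receives in this sketch.
type BaseContext = { collectionName: string };

function runTransform<TData, TExtra extends object, TResult>(
  doc: Doc<TData>,
  base: BaseContext,
  extendContext: (doc: Doc<TData>) => TExtra,
  transform: (data: TData, ctx: BaseContext & TExtra) => TResult,
): TResult {
  // The extension is computed per document, so its helpers can close over that document.
  const ctx = { ...base, ...extendContext(doc) };
  return transform(doc.data, ctx);
}

// Usage mirroring the example above: computeSid is available on ctx and fully typed.
const transformed = runTransform(
  { data: { title: "First Post" }, _meta: { id: "1" } },
  { collectionName: "posts" },
  (document) => ({ computeSid: () => `sid-${document._meta.id}` }),
  (data, ctx) => ({ ...data, sid: ctx.computeSid() }),
);
```

Because the extra context is a generic type parameter, the transform sees `ctx.computeSid` without any manual annotation.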
The implicit content property has caused me a lot of frustration, and many type helpers exist solely to infer it correctly. With version 1.0.0, I will likely remove this behavior. It requires a significant amount of hard-to-maintain code for a questionable feature. Starting from version 1.0.0, users will need to define content explicitly in the schema.
It will take some time to complete the entire feature. There is still much to do, and I will soon be on vacation.