
AI / LLMs 🤖

Open benallfree opened this issue 10 months ago • 31 comments

Description

After using Claude with Vike, I think it might be wise to have an llm.md that I can add to my cursor rules. There are some conventions in Vike that it doesn't seem to be inferring correctly, like to use +Head.tsx.

Is this something you'd be interested in? If so, I can help make the first pass at it.


Edit by @brillout:

See also

  • https://github.com/brillout/docpress/issues/63
  • https://github.com/brillout/docpress/pull/76

benallfree avatar Apr 05 '25 03:04 benallfree

Yes! I'm not that familiar with it; what would be the first step here?

brillout avatar Apr 05 '25 06:04 brillout

A good starting point would be to merge the docs into a single llm.txt or llm.md (recommended) so it can easily be consumed by LLMs to give additional context when working in Vike projects.

benallfree avatar Apr 05 '25 18:04 benallfree

👍

I guess concatenating all docs/ files does the trick 👀

Labeling as contribution-welcome.

brillout avatar Apr 05 '25 18:04 brillout

On it

benallfree avatar Apr 05 '25 20:04 benallfree

I agree, providing a proper llm.txt would be very helpful, not only for developers but also for Vike as a project. LLMSO is becoming increasingly important as more and more people consult an LLM directly instead of googling. Providers like Cloudflare have already started to provide an llm.txt for every doc page. There are already semi-established standards for this.

CookedApps avatar May 14 '25 15:05 CookedApps

Thanks for sharing. It's on our radar, and we're increasingly considering it a high priority. I'll dig further into it after I'm done with my current priorities.

brillout avatar May 15 '25 09:05 brillout

Also: https://www.kapa.ai (Astro uses it).

brillout avatar Aug 26 '25 14:08 brillout

Solution: create a custom JS script, executed after each build, that regenerates the llms-full.txt file.

P.S. If this is considered valuable, I can try to contribute it.

Quoting @Avcharov.

Yes! We consider it highly valuable and contributions are very much welcome.

We actually started discussing this with @phonzammi at https://github.com/brillout/docpress/issues/63 — let's discuss implementation details over there.

I've a question (unrelated to the implementation): do you know (or someone else in this thread) how LLMs get to know about these llm.txt URLs? Is that something the user manually has to tell LLMs, or is it automatically discovered and considered when OpenAI/Claude/... crawls websites? Ideally, Vike's llm.txt files would automatically be integrated to agent loops.

brillout avatar Oct 24 '25 09:10 brillout

Usually these links are provided by the user in an instructions file or directly in the prompt window. Alternatively, we can set up a Vike MCP server that Copilot uses to fetch the needed file. For example, the MCP server can expose a command get_best_practice that Copilot calls to receive llm.txt in response. So for automatic integration I would use a dedicated Vike MCP server.

Avcharov avatar Oct 24 '25 09:10 Avcharov

An MCP server would be great. Can we somehow make it fully automatic (i.e. zero-config), so that the user doesn't have to do anything for LLMs to know about Vike's MCP server / Vike's llm.txt files?

brillout avatar Oct 24 '25 10:10 brillout

Hey, I tried creating this, but I don't know whether it should look like this or something else, because I saw in #2796 the need for some kind of JS script to scrape the docs on each new build. Could someone check it out and correct me? Also, if it's good, where should I upload it?

llms.txt

Jakubowsky97 avatar Nov 05 '25 11:11 Jakubowsky97

@Jakubowsky97 Correct me if I'm wrong, but I think llms.txt should contain the content of the pages, not merely links to vike.dev pages. The idea is to add Vike's documentation to the LLM's context, without the LLM having to crawl vike.dev. See for example https://svelte.dev/docs/llms. That's why, for now, such a script is a step in the right direction.

brillout avatar Nov 05 '25 11:11 brillout

llms.txt is a general list of links to content chapters; llms-full.txt is for the full content.

Avcharov avatar Nov 05 '25 11:11 Avcharov

llms.txt is a general list of links to content chapters; llms-full.txt is for the full content.

Indeed. But isn't llms.txt a list of available llms.txt files (and not vike.dev links)? See for example https://svelte.dev/llms.txt.

brillout avatar Nov 05 '25 11:11 brillout

I was using the website https://llmstxthub.com/ to recreate it for Vike. Examples from that website: BetterAuth, Expo, and so on. But as you mention, I see many docs doing it differently, like Svelte does.

Jakubowsky97 avatar Nov 05 '25 12:11 Jakubowsky97

Even in the https://llmstxt.org example, the format is like this:

# Title

> Optional description goes here

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)

Jakubowsky97 avatar Nov 05 '25 12:11 Jakubowsky97

How about we create:

  • [Vike team] A new page vike.dev/basics that covers all basics that both humans and AI should know — this file should always be loaded in LLMs' context.
  • [Contribution welcome] llms-full.txt that contains the entire documentation using the script — that's the "ultimate weapon" if the LLM has enough free context left.
  • [Vike team] llms-medium.txt — compressed documentation for use with medium context windows
  • [Vike team] llms-small.txt — highly compressed documentation for use with smaller context windows
  • [Vike team] llms.txt that is merely the "header" (should always be loaded in the LLM's context) and contains:
    • Link to vike.dev/basics + description "ALWAYS LOAD this URL"
    • Link to llms-full.txt + description "LOAD this file, which is 21MB big, if you struggle using Vike (and, obviously, if your context has enough space to load this file)"
    • Links to llms-{small,medium}.txt + description
    • Links to all vike.dev pages + description "LOAD THIS URL if you think it's relevant for your current task and you didn't load llms-{full,medium}.txt"
  • [Contribution welcome] An MCP server (see discussion in this thread).

@everyone: WDYT?

(Actually, these are all contribution-welcome once the Vike team has solved this issue, but let's simply start with the script for llms-full.txt — we'll re-use that script for the other files.)

brillout avatar Nov 05 '25 13:11 brillout

Okay, so I've tried to create that script, and got something like this:

llms-full.md scrape-local-docs.js

I keep it in the root folder, so vike/scrape-local-docs.js. Could someone check whether it's good enough and whether llms-full should look like this?

Jakubowsky97 avatar Nov 07 '25 07:11 Jakubowsky97

@Jakubowsky97 Hi, there is already an open PR for this: https://github.com/brillout/docpress/pull/76

avcharovBetsson avatar Nov 07 '25 07:11 avcharovBetsson

I think it's pretty clear at this point that adding docs to the LLM's context — its "short term memory" — is the way forward. Adding some kind of header might be enough.

The generated llms-full.md is 733.2 KB big, which is currently too large to fit in the context window of LLMs.

I'm using Augment Code which is pretty good at deducing tool usage by looking at the repository's code (it does this automatically). Maybe we don't need to add that much docs to the LLM's context. How about this:

  • Header: links to all vike.dev pages + description "LOAD THIS URL if you think it's relevant for your current task"
  • Maybe also some unusual DX, e.g. Telefunc's event-based usage

@everyone What's your experience with your LLM? Does it generate valid Vike code? Is there something specific that your LLM repeatedly does wrong with Vike?

I suggest we discuss/dig/think about this a bit. I think we can make AI proficient with Vike ― and reliably so. In my experience, adding the right documentation of a given tool to the LLM's context makes the LLM proficient with that tool.

brillout avatar Nov 07 '25 11:11 brillout

👍 to the suggestion to have links and descriptions in the top-level llms.txt. This is nice because the agent can parse it and find whatever docs it thinks are relevant; moreover, the developer can learn the pattern for translating (human-readable) docs URLs into the URLs serving the markdown if they want to manually feed URLs into the context.

Similarly, having a "Copy Markdown" button directly on the docs can be incredibly helpful to give the LLM the context of the docs you have up on the side.

As for the mention of telefunc, I really think that should live in an llms.txt on telefunc.com. Vike is so flexible about all parts of the stack that I don't think the Vike-specific llms.txt should mention telefunc. For example, I use trpc and have included instructions to use trpc in my CLAUDE.md. I think the agent might get confused if my CLAUDE.md tells it to use trpc while Vike's llms.txt puts instructions about telefunc into the context.

For Vike specific instructions, it's nothing really too interesting. I have specific instructions about Vike's hooks and how I use them: using +data in a SSR app, specific instructions on authorization checks with +guard.

One challenge to think about: since Vike is so flexible, there are many permutations of which docs are relevant to a particular agent session. I think it's going to be key to direct users to create a CLAUDE.md/AGENTS.md to add specific instructions about how they use Vike.

nickdichev avatar Nov 08 '25 00:11 nickdichev

however, the developer can learn the pattern to translate (human readable) docs URLs to the URL serving the markdown if they want to manually feed URLs into the context.

How about the user simply shares the URL of the documentation page to the LLM?

Similarly, having a "Copy Markdown" button directly on the docs can be incredibly helpful to give the LLM the context of the docs you have up on the side.

I don't see the point of providing an LLM version for each page. For example, at the bottom of https://svelte.dev/docs/kit/routing there is a link llms.txt. I don't understand the purpose of this.

In my experience, LLMs are fine with parsing HTML. If I add a URL to my prompt, my LLM automatically crawls the URL and extracts the content.

I really think that should live at an llms.txt on telefunc.com.

Agreed, this shouldn't be part of Vike's LLM context. I was just giving an example of unusual DX that LLM should know about. I wonder what kind of Vike usage is somewhat unusual that LLMs should know about.

I have specific instructions about Vike's hooks and how I use them: using +data in a SSR app

When I start using Augment Code for a new project, it looks at all patterns used inside the repository. So I wonder how much value such instructions bring, since these instructions already exist in the form of patterns used inside the repository. Maybe it's about further ensuring your LLM respects your patterns — if you omit these instructions, does your LLM's quality decrease noticeably?

specific instructions on authorization checks with +guard.

I see. I guess such instructions aren't easy for the LLM to automatically extract when analysing your repository (aka the "indexing phase").

One challenge I think you need to think about is that since Vike is so flexible there are many permutations of which docs are relevant to a particular agent session.

Good point. Maybe some (most?) of it can be covered by LLMs analysing existing patterns in the repository. Maybe we can have some high-level architectural instructions, e.g. when using vike-react-query, data fetching happens at the component level ― when you add a new loading component, always double-check whether you introduce a waterfall (analyze the component tree).

brillout avatar Nov 08 '25 11:11 brillout

Speaking of AGENTS.md, how about Vike generates AGENTS-vike.md while appending a line ALWAYS LOAD ./AGENTS-vike.md to AGENTS.md?

Do we need an MCP server then? I don't think so?

brillout avatar Nov 08 '25 11:11 brillout