website AI improvements

At the moment, LLMs handle AsyncAPI poorly and are not capable of generating correct AsyncAPI v3 documents.

My observation is that they internally generate v2 documents and label them v3. Sometimes they attempt a conversion, but the output is still wrong.

We need some basic quality improvements to make sure LLMs get proper context about AsyncAPI:

make sure that on each build we generate and expose llms-full.txt and llms.txt
make sure they are added to https://llmstxt.site/ and https://directory.llmstxt.cloud/
we have https://www.asyncapi.com/rss.xml, but we also need to explicitly generate sitemap.xml (rss.xml is explicitly configured for Google Search, but Google Search is not the only consumer we need to satisfy) and llms.txt (a sitemap-like index for LLMs)
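
As a rough illustration of the sitemap.xml item above, here is a minimal sketch in Python. It is not how the website build currently works: the function name `build_sitemap` and the page list are hypothetical, and a real implementation would live in the site's own build scripts and collect URLs from the generated routes. The only fixed part is the standard sitemap namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return a sitemap.xml document (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # Deduplicate and sort so the output is stable between builds.
    for page_url in sorted(set(page_urls)):
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page list, for illustration only.
pages = [
    "https://www.asyncapi.com/",
    "https://www.asyncapi.com/docs",
    "https://www.asyncapi.com/docs",  # duplicate on purpose, dropped in the output
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Running this in the build step and writing the result to the public output folder would keep the file in sync with the pages, which is the point of generating it on each build.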

Last but not least, you need to explore:

what approach should be applied to https://v2.asyncapi.com/? Do we change anything there?
it probably also makes sense to expose the JSON Schema. It is already in the website repo: https://github.com/asyncapi/website/blob/master/config/3.0.0.json

The above is just a rough outline of the basics. If you know of any other recent techniques to improve the AI experience even further, please share.

Sep 10 '25 12:09 derberg

Yeah, or even an AsyncAPI MCP for validation, like LikeC4 now has. Or Serna can gather such knowledge with the help of a language server, or other tooling.

Sep 10 '25 12:09 hwinkel

The easiest part: make sure the Context7 index is correctly fed with the intended information. Nearly all AI agents have the Context7 MCP configured to look up library docs. https://context7.com/asyncapi/spec

Sep 10 '25 12:09 hwinkel

@derberg, how can I contribute to improving the broken LLM?

Sep 13 '25 07:09 dishafaujdar

@derberg @hwinkel

My approach to fix LLMs:

LLMs keep mixing v2 + v3 because they don’t have a clean map of our docs.

Plan: Give them proper files.

Steps I’ll do:

Auto-generate sitemap.xml → so all search engines (not just Google) see the whole site.
Create llms-full.txt → big list of all important links (specs, schemas, tutorials, blog posts).
Create llms.txt → small index that points to llms-full.txt, sitemap, rss, schemas.
Hook the generation of these files into the site build (so they’re always fresh).
Register llms.txt on llmstxt.site + directory.llmstxt.cloud
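
The llms.txt step above could look roughly like the sketch below. It assumes the layout from the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of markdown links). The helper name `build_llms_txt`, the section names, and the llms-full.txt and sitemap.xml URLs are assumptions for illustration; only the rss.xml and schema links come from this thread.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt index: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines)


# Section names and the first two URLs are hypothetical placeholders.
sections = {
    "Indexes": [
        ("Full link list", "https://www.asyncapi.com/llms-full.txt",
         "specs, schemas, tutorials, blog posts"),
        ("Sitemap", "https://www.asyncapi.com/sitemap.xml", "all site pages"),
        ("RSS feed", "https://www.asyncapi.com/rss.xml", "blog updates"),
    ],
    "Schemas": [
        ("AsyncAPI 3.0.0 JSON Schema",
         "https://github.com/asyncapi/website/blob/master/config/3.0.0.json",
         "schema file in the website repo"),
    ],
}

llms_txt = build_llms_txt(
    "AsyncAPI",
    "Official AsyncAPI documentation index. Prefer v3 content.",
    sections,
)
print(llms_txt)
```

Calling this from the same build hook as the sitemap generation would keep both files fresh without a separate manual step.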

Result: Search engines and LLMs stop hallucinating broken documents and always find the official v3 content.

If there's anything else that needs to be done, lemme know. Also, I wanna ask, can I start contributing?

Sep 13 '25 11:09 dishafaujdar

@dishafaujdar looks good. Most critical: don't do it all in a single PR. Step by step, please. The smaller the PR, the easier it is to review and merge. Order doesn't matter.

Sep 13 '25 11:09 derberg

@derberg Gotcha!

Sep 13 '25 12:09 dishafaujdar

Make a separate PR for the llms-full.txt generation. LLMs behave differently depending on the context they are given, so this will improve over time. We can start with the following:

add JSON Schema validation examples directly
include explicit v2→v3 migration examples
add complete working examples for common patterns (pub/sub, request/reply)
consider adding anti-examples showing common mistakes LLMs make

We can also make a separate PR to structure the documentation to work better with RAG systems, and add semantic metadata to help LLMs retrieve the most relevant sections for search tool calling.
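
To make the anti-example idea concrete, here is a small sketch of the mistake described at the top of this thread: a document labeled v3 that still uses v2 structure. In v2, operations sit inside each channel under `publish`/`subscribe`; in v3 they move to a top-level `operations` object with `action: send` or `action: receive`. The function name `find_v2_leftovers` and both sample documents are made up for illustration, and this is only a heuristic, not a replacement for validating against the JSON Schema.

```python
def find_v2_leftovers(doc):
    """List v2-only constructs found in a document that declares AsyncAPI 3.x."""
    problems = []
    if not str(doc.get("asyncapi", "")).startswith("3."):
        return problems  # only documents labeled v3 are checked
    for name, channel in (doc.get("channels") or {}).items():
        if not isinstance(channel, dict):
            continue
        for keyword in ("publish", "subscribe"):
            if keyword in channel:
                problems.append(
                    f"channels.{name}.{keyword}: v2 keyword, "
                    "v3 uses top-level 'operations' with an 'action'"
                )
    return problems


# Anti-example: v2 structure with a v3 version label.
mislabeled = {
    "asyncapi": "3.0.0",
    "info": {"title": "Demo", "version": "1.0.0"},
    "channels": {
        "user/signedup": {
            "subscribe": {"message": {"payload": {"type": "object"}}}
        }
    },
}

# The same idea expressed with v3 structure.
valid_v3 = {
    "asyncapi": "3.0.0",
    "info": {"title": "Demo", "version": "1.0.0"},
    "channels": {
        "userSignedup": {
            "address": "user/signedup",
            "messages": {"UserSignedUp": {"payload": {"type": "object"}}},
        }
    },
    "operations": {
        "onUserSignedup": {
            "action": "receive",
            "channel": {"$ref": "#/channels/userSignedup"},
        }
    },
}

print(find_v2_leftovers(mislabeled))
print(find_v2_leftovers(valid_v3))
```

Pairs like these, a wrong document next to its corrected form, are the kind of content that could go into the docs so that retrieval gives models an explicit contrast between the two versions.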

Sep 13 '25 13:09 Manancode

@Manancode Sure, I'll keep that in mind.

Sep 13 '25 14:09 dishafaujdar

@derberg can I work on this, or is it already completed?

Dec 10 '25 18:12 Sourya07

@Sourya07 yes, go ahead. I just closed https://github.com/asyncapi/website/pull/4416 as there was no progress. When you work on your PR, look at https://github.com/asyncapi/website/pull/4416 too and take into account the discussion there.

Dec 11 '25 16:12 derberg