🧹 Help us clean up duplicate (or unnecessary) libraries in Context7
Hey everyone,
We need your help to clean up libraries in Context7.
While searching for a library, if you come across multiple instances of the same one, please report them under this issue. Include a screenshot or a short note about what you searched for and what you found, and we’ll clean up the duplicates.
Besides duplicates, you can also suggest libraries you think should be deleted—just let us know which ones and why.
Thanks in advance!
A few questions about the quest:
Garbage Collection: Has the project considered automated removal based on usage metrics? Libraries unused/not-queried for 60+ days could be auto-removed since adding them back is trivial. This avoids manual bias and prevents "popularity contest" dynamics where newer libraries get unfairly targeted or someone just doesn't like an approach.
Duplicate Detection: I've tried submitting duplicates before and the system blocked me, so some detection exists. If it is not already implemented, hashing submissions and checking them against existing GitHub URLs would be a straightforward and inexpensive way to prevent duplicates upfront.
The "suggest libraries you think should be deleted" approach is a perfect example of the popularity contest issue - manual curation risks favoring established libraries over promising newcomers. Usage-based cleanup seems more objective and scalable.
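To make the usage-based cleanup idea concrete, here is a minimal sketch. It assumes a mapping from library ID to the time of its most recent query is available; the function name, the data shape, and the library IDs in the usage note are hypothetical and do not describe how Context7 actually stores usage.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the "60+ days" suggestion above.
STALE_AFTER = timedelta(days=60)


def find_stale_libraries(last_queried, now=None):
    """Return the IDs of libraries not queried for 60 days or more.

    last_queried: dict mapping a library ID to the timezone-aware datetime
    of its most recent query (hypothetical data shape).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        lib_id
        for lib_id, last_seen in last_queried.items()
        if now - last_seen >= STALE_AFTER
    )
```

For example, with `now` fixed, a library last queried 61 days ago would be listed and one queried 10 days ago would not. Flagging candidates for review rather than deleting them outright might be a safer first step.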
Yep, we already check for duplicates. We will think about usage-based cleanup. Thanks for the feedback!
https://context7.com/context7/docs_astral_sh-uv, https://context7.com/astral-sh/setup-uv and https://context7.com/llmstxt/docs_astral_sh-uv-llms.txt are duplicates of https://context7.com/astral-sh/uv
https://context7.com/llmstxt/staging_bryntum-products-calendar-llms.txt is a duplicate and should be deleted; it's based on the staging server.
Actual library: https://context7.com/llmstxt/bryntum-products-calendar-llms.txt
Hey @cobrabr, are you sure setup-uv is the same as uv?
This is a somewhat adjacent issue: I use TailwindCSS extensively, and the references often jump between v3 and v4 links. Even within v4, the LLM pulls the wrong topic. Today's example was perfect. We wanted to test the TailwindCSS typography plugin, but the topic "typography plugin @plugin directive v4", pulled from "context7/tailwindcss", landed on "Load Legacy Tailwind Plugin with @plugin directive" instead. There are clearly more relevant items in the library: https://context7.com/context7/tailwindcss But perhaps a library with 500k tokens bears a higher risk of misdirection? I can't even call this hallucination. It clearly searched and landed on the wrong topic.
Hey Tenelia. The snippet about the Legacy Tailwind Plugin exists in all docs resources, including the docs repo. I don't know how to filter it out. I hope the AI agent would be smart enough not to use code flagged as legacy.
https://github.com/tailwindlabs/tailwindcss.com/blob/main/src/docs/functions-and-directives.mdx#_snippet_11
Relying on the agent doesn't seem feasible, because the snippet still gets picked up in the wrong way. A misdirection sends the AI down the wrong path, leading to a cascading series of missing imports, syntax issues, etc. Are you able to grep and remove sections? As things scale, you may need to have batch jobs set up for each lib.
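A rough sketch of what such a grep-style batch filter could look like, assuming each indexed snippet is a dict with a "title" field. That data shape, the pattern, and the function name are assumptions for illustration, not Context7's actual pipeline.

```python
import re

# Hypothetical pattern: drop snippets whose title marks them as legacy or deprecated.
LEGACY_PATTERN = re.compile(r"\b(legacy|deprecated)\b", re.IGNORECASE)


def drop_legacy_snippets(snippets):
    """Return only the snippets whose title does not match LEGACY_PATTERN."""
    return [s for s in snippets if not LEGACY_PATTERN.search(s["title"])]
```

Run against a list containing "Load Legacy Tailwind Plugin with @plugin directive", this would drop that entry and keep the rest. A per-library pattern would probably be needed, since "legacy" can also appear in titles that are still relevant.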
Hey @cobrabr, are you sure setup-uv is the same as uv?
No, it's not, you're right. My bad, good catch.
You really need to add a voting system on the site. That way we can downvote and flag stuff quickly; once an entry gets enough votes and flags, it triggers a review.
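As an illustration of the flag-then-review idea, here is a minimal sketch. The class name, the threshold of 5, and the choice to count each user at most once per library are all assumptions, not an existing Context7 feature.

```python
REVIEW_THRESHOLD = 5  # hypothetical number of distinct flags that triggers a review


class FlagTracker:
    """Tracks which users flagged which library and reports when review is due."""

    def __init__(self, threshold=REVIEW_THRESHOLD):
        self.threshold = threshold
        self._flaggers = {}  # library ID -> set of user IDs that flagged it

    def flag(self, library_id, user_id):
        """Record a flag; return True once enough distinct users have flagged."""
        users = self._flaggers.setdefault(library_id, set())
        users.add(user_id)  # a set ignores repeat flags from the same user
        return len(users) >= self.threshold
```

Counting distinct users rather than raw flags keeps a single person from forcing a review by flagging repeatedly.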
In relation to this: if the repo for a docs site were made public, would that be a preferable method for adding docs to this tool?
And if so, how would they go about switching Context7 over from the website to the markdown repo?
You can add both versions and let the LLM pick the one you like, but repos are the recommended method.
Remove /bevyengine/bevy in favor of docs.rs/bevy/latest
Why, does it have anything missing?
https://context7.com/mui/material-ui-docs is a duplicate of https://context7.com/mui/material-ui.
I think we can argue that https://context7.com/mui/material-ui-docs is the one to remove. It looks like https://context7.com/mui/material-ui correctly identifies releases (it doesn't blindly take HEAD as the most recent version).
@enesgules
Why, does it have anything missing?
If you were asking me, yes it does, because a lot of Bevy documentation is generated using cargo doc, which builds it from code comments that Context7 doesn't read.
context7.com/mui/material-ui-docs is a duplicate of context7.com/mui/material-ui.
I think we can argue that context7.com/mui/material-ui-docs is the one to remove. It looks like context7.com/mui/material-ui correctly identifies releases (it doesn't blindly take HEAD as the most recent version).
I have redirected the library to the other one.
Why, does it have anything missing?
If you were asking me, yes it does, because a lot of Bevy documentation is generated using cargo doc, which builds it from code comments that Context7 doesn't read.
I have redirected the libraries.
Hi there,
First of all, thank you for your stunning work with context7.
I regularly use llamaindex (TypeScript in this case), which has two entries that differ only in the .git ending of the GitHub link; both resolve to the same repo: https://context7.com/run-llama/llamaindexts.git https://context7.com/run-llama/llamaindexts
What do you think, could a check on the https://context7.com/add-library?tab=github page for a trailing .git string help reduce duplicates? Assuming links should not be submitted with a trailing .git, the page could strip the string, check for an already existing lib, and guide the user to that entry.
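The strip-and-check step could be sketched roughly like this. It assumes submissions are full GitHub URLs with a scheme and that existing libraries are keyed by a lowercase owner/repo string; both function names and that key format are hypothetical.

```python
from urllib.parse import urlparse


def normalize_github_url(url):
    """Reduce a GitHub repo URL to a lowercase 'owner/repo' key without '.git'."""
    parts = [p for p in urlparse(url.strip()).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url!r}")
    owner, repo = parts[0], parts[1].removesuffix(".git")  # Python 3.9+
    return f"{owner}/{repo}".lower()


def find_existing(url, existing_keys):
    """Return the matching key if the repo is already indexed, else None."""
    key = normalize_github_url(url)
    return key if key in existing_keys else None
```

With this, the link with the trailing .git and the one without it map to the same key, so the second submission could be pointed at the existing entry instead of creating a new one.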
Yes, we already do this, but these entries might have been added before our check was in place. We will remove the one with the .git extension, thank you!
Multiple instances of microsoft_learn
Multiple instances of .NET
https://context7.com/mui/base-ui is called "MUI Base UI"
But the correct name is "Base UI". Can this be updated? We might add a context7.json file https://github.com/mui/mui-public/issues/555 but for now, it seems that it's better solved in the database directly.
This is also a duplicate of https://context7.com/llmstxt/base-ui_llms_txt. Since Context7 seems to recommend GitHub over llms.txt sources, we can keep the GitHub-sourced one.
I have changed the name of the library. The llmstxt version seems to have more content than the GitHub library, but would you still prefer it to be removed? @oliviertassinari
Hey there! These seem to be duplicates: https://context7.com/websites/airflow_apache and https://context7.com/websites/airflow_apache-apache-airflow-stable
Removed the smaller one, @Siddha911
I have recently been using Arrow-kt a lot, and I noticed that context7 had only this GitHub repo configured for the library: https://context7.com/arrow-kt/arrow
Unfortunately, that GitHub repo does not have a lot of documentation, and most queries for practical use cases fail.
I added two improved sources of documentation:
- https://context7.com/websites/arrow-kt_io_learn (the main docs website for arrow-kt)
- https://context7.com/websites/apidocs_arrow-kt_io (the API docs for arrow-kt)
However, I've noticed that agents like Codex and Claude always go for the first GitHub library, which always results in useless context being given to them.
I believe the GitHub lib should be removed, and either the docs site or the API docs should be kept. I don't know what the benefit of keeping one or both is. It would be nice if their context could be merged into one somehow 🤷 . The API docs are linked from the main site, but because the domain is different, I think they don't get picked up during indexing.
We can redirect the GitHub one to the main docs website, so that when someone clicks it, they are automatically redirected to the website version. It will also no longer be visible in MCP, so your agents can't choose it. As an alternative to merging the libraries, you can tell your agent to make two queries with smaller token limits. But yeah, we index based on domain, and subdomains are not added to the main domain.
I think the redirect is a good idea. Basically, it seems like agents will go for the simpler and more straightforward library ID, arrow-kt/arrow, as opposed to the other ones, which makes sense. I think that should fix the "default" experience, and from there, ofc, if we need to query the API docs, specific instructions can be given.
Redirected, thanks for reporting!