DocsGPT
Develop a community website to centralize pre-trained docs
Hi everyone,
Just landed on this project, and I can already see the huge impact it could have on software programming!
The idea would be to develop a community website where we could search for stores to download and import directly into DocsGPT. This would save people from paying fees when the documentation for a given version has already been ingested by somebody else.
Another idea would be to have this community list integrated directly within the app. Alternatively, DocsGPT could be hosted online directly.
That's exactly the task I've been working on since yesterday. Thanks a lot for the issue. I'll add it to the roadmap; it's very descriptive.
I already have a public S3 bucket with CloudFront. Working on an index for it.
Also, for user submissions I just created a Google Form. Maybe you have some ideas? At the end of the day we need a process in place to check that the embeddings are correct, because some people could use this as a very new form of spamming 😂
One could create a separate GitHub repo to store all the indexes, and submission/integration could be done through PRs, with GitHub CI automating part of the checks. I don't know whether those checks should be done by humans or whether it's something we could fully automate.
@bil0u I like that idea, but you need to track .. Depending on the size of the files, it may require Git LFS: https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage
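For what it's worth, the automatable part of those checks could look roughly like the sketch below, run from CI on each PR. The file names in `REQUIRED_FILES` are hypothetical placeholders (the thread does not say which files make up an index), and `check_submission` is an illustrative helper, not something that exists in DocsGPT. The size limit reflects GitHub rejecting files over 100 MiB unless they go through Git LFS.

```python
from pathlib import Path

# Hypothetical names: adjust to whatever files an index really consists of.
REQUIRED_FILES = {"index.faiss", "index.pkl"}
# GitHub blocks files larger than 100 MiB in a regular (non-LFS) repository.
MAX_FILE_BYTES = 100 * 1024 * 1024


def check_submission(directory: Path) -> list[str]:
    """Return a list of problems found in one submitted index directory."""
    problems = []
    files = sorted(p for p in directory.iterdir() if p.is_file())
    present = {p.name for p in files}
    # Every expected file must be there.
    for name in sorted(REQUIRED_FILES - present):
        problems.append(f"missing file: {name}")
    # Oversized files would need Git LFS.
    for path in files:
        if path.stat().st_size > MAX_FILE_BYTES:
            problems.append(f"too large for plain git: {path.name}")
    return problems
```

This only covers structure and size. Whether the embeddings themselves are correct would still need a human, or some other check, to look at them.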
https://github.com/arc53/DocsHUB
Created this repo for this.
I think we can probably use GPT to automate the checks, hahaha. But for starters it has to be manual.
Also, we have to decide on the folder naming convention. One person suggested using URLs as folder names, and I keep thinking that it's probably a good idea.
I'm not against using URLs as folder names, but it may become a bit cluttered with a lot of entries.
What about <language>/<library_name>/<version>/ as a folder prefix? Also, I'm wondering whether multiple versions can be handled in one index or whether a separate index has to be created for each version. If one index can cover them all, the <version> part is unnecessary.
Finally, what about adding a README.md alongside each added index? It could be templated and used to record some metadata about the index file, like author, library URL, version, etc.
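To make the proposal concrete, here is a minimal sketch of the prefix and the templated README. The helper names (`index_dir`, `render_readme`), the lowercasing of language and library names, and the exact README fields are my own assumptions for illustration; only the `<language>/<library_name>/<version>/` layout and the metadata items (author, library URL, version) come from the suggestion above.

```python
from pathlib import PurePosixPath

# Assumed template; the fields mirror the metadata suggested above.
README_TEMPLATE = """\
# {library} {version}

- Author: {author}
- Library URL: {url}
- Version: {version}
"""


def index_dir(language: str, library: str, version: str) -> PurePosixPath:
    """Build the <language>/<library_name>/<version>/ prefix for one index."""
    for part in (language, library, version):
        # Reject empty components and anything that would add extra levels.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    # Lowercasing is an assumption, to avoid Pandas/ vs pandas/ duplicates.
    return PurePosixPath(language.lower(), library.lower(), version)


def render_readme(library: str, version: str, author: str, url: str) -> str:
    """Fill in the README template for one index."""
    return README_TEMPLATE.format(
        library=library, version=version, author=author, url=url
    )
```

For example, `index_dir("Python", "Pandas", "1.5.3")` gives `python/pandas/1.5.3`, and the rendered README would sit next to the index files in that directory.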
Sorry, really busy right now. Hmm, yeah, the version has to be there, and having separate indexes for separate versions is also very convenient for users, because they can ask questions about things that are in older versions.
My only concern is that people will start adding documentation for things like Solidity or maybe even Python itself, which will break this structure.
The most reasonable thing I can think of right now is probably some database that has an API endpoint for search and everything. Inside that database we have all the mappings. And maybe we just dump the contents of the DB to GitHub once in a while.
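A rough sketch of what that mapping database could look like, using SQLite from the Python standard library. The table layout, the column names, and the `location` values are all assumptions made up for illustration, and the API endpoint itself is left out: `search` is just the function such an endpoint would call, and `dump` produces the JSON that could be committed to the repo from time to time.

```python
import json
import sqlite3

COLUMNS = ("name", "version", "location")


def create_db(conn: sqlite3.Connection) -> None:
    """Create the (assumed) mapping table: one row per index."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS indexes ("
        " name TEXT NOT NULL,"
        " version TEXT NOT NULL,"
        " location TEXT NOT NULL,"
        " PRIMARY KEY (name, version))"
    )


def add_index(conn: sqlite3.Connection, name: str, version: str, location: str) -> None:
    """Register where the index for one name/version pair is stored."""
    conn.execute("INSERT INTO indexes VALUES (?, ?, ?)", (name, version, location))


def search(conn: sqlite3.Connection, term: str) -> list[dict]:
    """Return all mappings whose name contains `term` (empty term matches all)."""
    rows = conn.execute(
        "SELECT name, version, location FROM indexes"
        " WHERE name LIKE ? ORDER BY name, version",
        (f"%{term}%",),
    )
    return [dict(zip(COLUMNS, row)) for row in rows]


def dump(conn: sqlite3.Connection) -> str:
    """Serialize every mapping as JSON, e.g. for a periodic dump to GitHub."""
    return json.dumps(search(conn, ""), indent=2)
```

Since the mapping is keyed on name and version rather than on a folder layout, documentation for a language itself (Solidity, Python) is just another row and does not break anything.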
Having a real database would be ideal if the app goes in the direction of using a backend to facilitate future features development.
What I like about having a GitHub repo for this is that it gives us several advantages:
- Easy and light setup
- Enhanced tracking of versions and collaboration
- A lot of the process could be automated through GitHub Actions
Also, for the naming, maybe a designated word could indicate that we are talking about the language itself, allowing us to keep only the library name for directories. In the case of Python, options could be:
- python/python
- python/.lang
- python/_lang
- ...
An advantage of using . or another special character as a prefix is that this specific directory would be at the top of listings in the GitHub UI.
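A quick way to see how those prefixes would sort, assuming the listing uses plain byte-order sorting of names (that is what Python's `sorted` does on ASCII strings; I have not verified that the GitHub UI orders entries exactly this way). The directory names are made-up examples and `listing` is just an illustrative helper.

```python
def listing(names: list[str]) -> list[str]:
    """Sort directory names by code point, as a stand-in for a file listing."""
    return sorted(names)


# '.' (0x2E) sorts before all letters; '_' (0x5F) sorts after uppercase
# letters but before lowercase ones.
print(listing(["numpy", "Django", "_lang", "pandas", ".lang"]))
# → ['.lang', 'Django', '_lang', 'numpy', 'pandas']
```

So under that assumption `.lang` always ends up first, while `_lang` only ends up first as long as no library directory starts with an uppercase letter.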
Edit: My idea of using a GitHub repo comes from similar use cases like Chocolatey Packages or Brew Casks.
Also @dartpain, happy to discuss it anytime! Feel free to contact me at the email address on my profile to organize a session!
is done now, I think at least as proof of concept it will work.