Can you support uploading GitHub repo links?

Open superbayes opened this issue 1 year ago • 14 comments

Can you support uploading GitHub repository links, such as https://github.com/panjh/dxflib.git?

superbayes avatar May 25 '23 01:05 superbayes

Yeah! That would require a new kind of loader. Interested in doing it? ;)

StanGirard avatar May 25 '23 06:05 StanGirard

> Yeah! That would require a new kind of loader. Interested in doing it? ;)

For programmers, I think this will be a very useful feature. Imagine your boss asks you to get familiar quickly with a niche source code library that has few online reference materials. In that case, it would be great to have GPT help us read the source code.

superbayes avatar May 25 '23 06:05 superbayes

Agreed 👍

StanGirard avatar May 25 '23 06:05 StanGirard

Would this only load the README in the root of the project and save it, or more than that?

gogo-ashacode avatar May 25 '23 16:05 gogo-ashacode

There is a git loader in langchain that could be used.

StanGirard avatar May 25 '23 16:05 StanGirard

@StanGirard I'd be up to help with this.

I want to think a bit about what this implementation might look like (which file extensions / folders we might want to exclude by default in the import, etc.), but I can post a plan of attack here this weekend and get feedback before starting on it.

mattlebel avatar May 26 '23 23:05 mattlebel

The implementation would be:

Feed the URL into the crawl endpoint. If the URL has GitHub in it and ends with .git, then we load the GitHub repo :)
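
A minimal sketch of that check in Python, for illustration only. The helper name `is_git_repo_url` and the `GIT_HOSTS` set are hypothetical; the actual crawl endpoint code is not shown in this thread. The sketch compares the parsed hostname rather than searching for "github" anywhere in the string, so a URL such as `https://example.com/github.com/x.git` is not mistaken for a GitHub repo.

```python
from urllib.parse import urlparse

# Hosts treated as Git hosting services. Only github.com comes from the
# proposal in this thread; adding other hosts here is an assumption.
GIT_HOSTS = frozenset({"github.com"})


def is_git_repo_url(url: str, hosts: frozenset = GIT_HOSTS) -> bool:
    """Return True if `url` is an http(s) URL on a known Git host ending in .git."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Treat www.github.com the same as github.com.
    if host.startswith("www."):
        host = host[4:]
    return host in hosts and parsed.path.rstrip("/").endswith(".git")
```

The crawl endpoint could call this first and fall back to the normal web crawler when it returns `False`. Passing a larger `hosts` set would cover other services without changing the function.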

StanGirard avatar May 27 '23 06:05 StanGirard

Ah, you're 100% right - we can just trust that the user has made decisions via their .gitignore and take all files.

Any objections to also looking for gitlab.com URLs in addition to github.com?

I am starting on this

mattlebel avatar May 27 '23 15:05 mattlebel

Exactly! There is even a git loader if you're interested.

https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/git.html#clone-repository-from-url
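
For reference, a hedged sketch of how that loader might be called. It assumes LangChain's `GitLoader` (which needs the GitPython package installed) and its `clone_url`, `repo_path`, and `branch` parameters. The helper names `local_repo_path` and `load_repo`, the `main` default branch, and the temp-directory layout are illustrative and not taken from quivr. The import path has moved between LangChain releases, so the sketch tries both.

```python
import os
import tempfile
from urllib.parse import urlparse


def local_repo_path(clone_url: str, base_dir: str) -> str:
    """Derive a local checkout directory from the clone URL (hypothetical helper)."""
    name = os.path.basename(urlparse(clone_url).path.rstrip("/"))
    if name.endswith(".git"):
        name = name[: -len(".git")]
    # Fall back to a generic directory name if the URL has no repo segment.
    return os.path.join(base_dir, name or "repo")


def load_repo(clone_url: str, branch: str = "main"):
    """Clone the repository into a temp directory and return LangChain documents."""
    # Imported lazily so the helper above works without LangChain installed.
    try:
        from langchain_community.document_loaders import GitLoader
    except ImportError:
        from langchain.document_loaders import GitLoader  # older releases

    repo_path = local_repo_path(clone_url, tempfile.mkdtemp())
    loader = GitLoader(clone_url=clone_url, repo_path=repo_path, branch=branch)
    return loader.load()
```

A call would look like `load_repo("https://github.com/panjh/dxflib.git", branch="master")`. The branch name has to match one that exists in the repository, so a real integration would likely need to detect the default branch rather than hard-code it.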

StanGirard avatar May 27 '23 15:05 StanGirard

Please let me know if I should create a separate issue for this:

Uploading GitHub repos is not working for me, assuming I am doing it correctly.

If I drag and drop the repo as a file or enter it as a URL to be crawled, I see an error in the backend saying '402 Payment Required.'

I am running quivr locally.

JarrodWoodard avatar Jun 08 '23 16:06 JarrodWoodard

> Ah, you're 100% right - we can just trust that the user has made decisions via their .gitignore and take all files.
>
> Any objections to also looking for gitlab.com URLs in addition to github.com?
>
> I am starting on this

An update on my end - while I still plan to finish this, I have not gotten the gitloader functionality built yet. This request to add the functionality remains open for now.

mattlebel avatar Jun 08 '23 16:06 mattlebel

I'm looking forward to this feature and to dropping the quivr repo into it. I'll be a lot better at helping out and providing feedback afterwards. :)

JarrodWoodard avatar Jun 09 '23 22:06 JarrodWoodard

I tried this feature again, seeing as there was a recent update. I tried crawling the quivr repo so that I can hopefully avoid bugging others with dilettante questions and maybe contribute here and there.

It looked like it worked in the end, but there was a message in the browser that said it failed.

JarrodWoodard avatar Jun 12 '23 03:06 JarrodWoodard

Just an update - I am no longer actively working on this in case someone else would like to grab it.

mattlebel avatar Jul 03 '23 15:07 mattlebel

Thanks for your contributions, we'll be closing this issue as it has gone stale. Feel free to reopen if you'd like to continue the discussion.

github-actions[bot] avatar Aug 22 '23 16:08 github-actions[bot]

Thanks for your contributions, we'll be closing this issue as it has gone stale. Feel free to reopen if you'd like to continue the discussion.

github-actions[bot] avatar Sep 22 '23 20:09 github-actions[bot]