Add support to read from git/github/gitlab directly
Plain Git support doesn't seem straightforward (would we need to git fetch with --depth 1?)
For GitHub:
- Get a branch: /repos/{owner}/{repo}/branches/{branch}
- Get a tree: /repos/{owner}/{repo}/git/trees/{tree_sha}
The problem is that we need to fetch the whole index before we can read /abc/def.
Doesn't make sense. Let's close.
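To make the cost concrete, here is a minimal sketch of reading /abc/def through the two endpoints above. It assumes the branch response has already given us the root tree SHA, and it uses a hypothetical `fetch_tree` callback (stubbed here with made-up in-memory data and made-up SHAs) in place of real HTTP calls. The helper names are illustrative only.

```python
from typing import Callable, Dict, List

API = "https://api.github.com"  # public GitHub REST endpoint

Entry = Dict[str, str]


def branch_url(owner: str, repo: str, branch: str) -> str:
    """URL for 'Get a branch'."""
    return f"{API}/repos/{owner}/{repo}/branches/{branch}"


def tree_url(owner: str, repo: str, tree_sha: str) -> str:
    """URL for 'Get a tree'."""
    return f"{API}/repos/{owner}/{repo}/git/trees/{tree_sha}"


def resolve_path(root_sha: str, path: str,
                 fetch_tree: Callable[[str], List[Entry]]) -> Entry:
    """Walk `path` one segment at a time, fetching one tree per directory level."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("path must name an entry below the root")
    sha = root_sha
    for i, name in enumerate(segments):
        entries = fetch_tree(sha)  # one round trip per level in real use
        entry = next((e for e in entries if e["path"] == name), None)
        if entry is None:
            raise FileNotFoundError(path)
        if i < len(segments) - 1 and entry["type"] != "tree":
            raise NotADirectoryError(path)
        sha = entry["sha"]
    return entry


# Made-up trees standing in for API responses (assumption, not real data).
FAKE_TREES = {
    "root-sha": [
        {"path": "abc", "type": "tree", "sha": "abc-sha"},
        {"path": "README.md", "type": "blob", "sha": "readme-sha"},
    ],
    "abc-sha": [{"path": "def", "type": "blob", "sha": "def-sha"}],
}
calls: List[str] = []


def fake_fetch_tree(sha: str) -> List[Entry]:
    calls.append(sha)
    return FAKE_TREES[sha]


entry = resolve_path("root-sha", "/abc/def", fake_fetch_tree)
print(entry["sha"], "after", len(calls), "tree requests")
```

Even in this toy case, reading one file needs a tree listing for every directory on the way down, on top of the initial branch lookup.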
Actually, I think it would be fun to support partial reads for git repos. With this feature, we could read some files from a git repo without cloning all of it, and without having to worry about cleaning a local cache.
git already supports sparse-checkout, and here is the tech doc for developers
Still, the git protocol is really hard to understand, and I'm not sure how to implement this.
Seems rust-lang's git2 doesn't support this yet?
For GitHub, we can fetch contents via https://docs.github.com/en/rest/repos/contents?apiVersion=2022-11-28
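As a rough sketch of what that could look like: the contents endpoint takes the file path directly, and for a file it returns JSON carrying a `size` and base64-encoded `content`. The snippet below builds the URL and decodes a hand-written sample payload instead of calling the network. The helper names and the sample values are assumptions for illustration.

```python
import base64
import json
from typing import Optional
from urllib.parse import quote

API = "https://api.github.com"  # public GitHub REST endpoint


def contents_url(owner: str, repo: str, path: str,
                 ref: Optional[str] = None) -> str:
    """URL for 'Get repository content'; `ref` is an optional branch, tag, or SHA."""
    url = f"{API}/repos/{owner}/{repo}/contents/{quote(path.strip('/'))}"
    if ref:
        url += f"?ref={quote(ref, safe='')}"
    return url


def decode_file(payload: dict) -> bytes:
    """Return the raw bytes of a file-type contents response."""
    if payload.get("type") != "file" or payload.get("encoding") != "base64":
        raise ValueError("expected a base64-encoded file response")
    # b64decode skips the newlines embedded in the encoded content.
    return base64.b64decode(payload["content"])


# Hand-written sample shaped like a file response (assumption, not real data).
SAMPLE = json.dumps({
    "type": "file",
    "encoding": "base64",
    "size": 6,
    "path": "abc/def",
    "content": base64.encodebytes(b"hello\n").decode("ascii"),
})

payload = json.loads(SAMPLE)
data = decode_file(payload)
print(contents_url("owner", "repo", "/abc/def", ref="main"))
print(payload["size"], data)
```

Because the response includes `size`, a stat-like call would not need to decode the content at all, though the body is still transferred with this request.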
> Seems rust-lang's git2 doesn't support this yet?
Yes, which means we have to implement the protocol ourselves. That's why I posted the tech doc link.
It's been a long time, but I think I should explain why this feature is difficult to implement. In the Git V2 protocol, if we try to implement the list or stat method, the only way to get the size of a remote object is to actually pull the object data to the local machine, which is too heavy for both list and stat.
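To give a sense of what implementing this by hand involves, here is a minimal sketch of the pkt-line framing used by the protocol, applied to a V2 fetch request for a single object. It only builds the request bytes. A real client would also need to negotiate capabilities, send the request over a transport, and parse the returned packfile, none of which is shown. The function names are hypothetical.

```python
FLUSH_PKT = b"0000"  # ends a request
DELIM_PKT = b"0001"  # separates the command section from its arguments


def pkt_line(payload: bytes) -> bytes:
    """Frame `payload` as a pkt-line: 4 hex digits of total length, then the payload."""
    length = len(payload) + 4  # the length prefix counts itself
    if length > 65520:
        raise ValueError("payload too large for a single pkt-line")
    return f"{length:04x}".encode("ascii") + payload


def fetch_request(oid: str) -> bytes:
    """Build a bare-bones V2 fetch request asking for one object id."""
    return b"".join([
        pkt_line(b"command=fetch\n"),
        DELIM_PKT,
        pkt_line(f"want {oid}\n".encode("ascii")),
        pkt_line(b"done\n"),
        FLUSH_PKT,
    ])


request = fetch_request("a" * 40)  # placeholder object id, not a real one
print(request)
```

Note that the request asks for the object itself. There is nothing in it that asks only for metadata, which is why list and stat end up paying for a full fetch.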
fsspec will clone the repo to local first :rofl:
There hasn't been enough interest so far, let's close for now. Thanks @DCjanus for the information.