conan
[feature] Consider enhancing conan.tools.files.get() to support Git repos in addition to compressed file formats
What is your suggestion?
Conan 2 introduced the ability to locally cache a CCI recipe's source content, which helps reuse CCI recipes outside of CCI, but it is still difficult to use these recipes with (a fork of) the underlying Git source repo without modifying or replacing the typical source() method. On the plus side, many recipes now contain boilerplate code that simply calls the conan.tools.files.get function with the appropriate arguments from the conandata.yml file. It would be nice if one could simply replace or augment the entries in the conandata.yml file to specify the URL of a Git repo (e.g., https://github.com/madler/zlib.git) and the SHA of the commit to check out.
In pseudocode, this suggestion might look something like:
from conan.tools.scm import Git

def get(conanfile, url, md5=None, sha1=None, sha256=None, destination=".", filename="",
        keep_permissions=False, pattern=None, verify=True, retry=None, retry_wait=None,
        auth=None, headers=None, strip_root=False):
    if url.endswith(".git"):
        git = Git(conanfile)  # by default, the current folder "."
        git.clone(url=url, target=destination)  # git clone url target
        # point the helper at the clone so the following "checkout" runs inside it
        git.folder = destination  # cd target
        git.checkout(commit=sha1)  # git checkout commit (sha1 is reused as the commit ID)
    else:
        pass  # <existing implementation of "get">
NOTE: There are more efficient ways to implement this functionality as suggested in https://stackoverflow.com/questions/31278902/how-to-shallow-clone-a-specific-commit-with-depth-1
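As a rough sketch of the shallow approach described in that Stack Overflow thread, the git invocations could look like the following. The helper name `shallow_fetch_commands` is hypothetical and not part of Conan; it only builds the argument lists, and it assumes the server allows fetching a commit by its full SHA.

```python
def shallow_fetch_commands(url, commit, destination="."):
    """Build the git commands that fetch a single commit without full history.

    Hypothetical helper for illustration only; nothing here is a Conan API.
    """
    return [
        # Create an empty repository in the destination folder.
        ["git", "init", destination],
        # Register the remote to fetch from.
        ["git", "-C", destination, "remote", "add", "origin", url],
        # Fetch only the requested commit, with no ancestors.
        ["git", "-C", destination, "fetch", "--depth", "1", "origin", commit],
        # Check out what was just fetched.
        ["git", "-C", destination, "checkout", "FETCH_HEAD"],
    ]


for cmd in shallow_fetch_commands(
    "https://github.com/madler/zlib.git", "<full-commit-sha>", "src"
):
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` to actually perform the fetch.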
And, of course, a nice accompanying request would be the ability to specify (perhaps through an environment variable) an alternate location for the conandata.yml file, so as to eliminate the need to modify even that aspect of a recipe.
I am interested to hear your thoughts on this suggestion and to know whether such an idea has been requested by others or considered (I didn't see anything obvious when searching through existing issues). Thanks
Have you read the CONTRIBUTING guide?
- [X] I've read the CONTRIBUTING guide
One addition: this could also enable the automatic source backup if it's a git checkout, which would be cool.
I agree this could be a nice addition, thanks for the suggestion @System-Arch
The only issue is that it seems we will not be able to prioritize it in the short term (marking it as long-term roadmap 2.X). There are many other higher priorities, and this one:
- is a nice-to-have, but not a blocker
- can already be implemented in recipes; it is just more verbose and requires more custom lines
- is partly covered already: most Git servers, such as GitHub and GitLab, have HTTP download URLs for specific commits, tags, and releases, and those already support backup sources and caching
- is not straightforward to implement, especially the cache and backup, as that feature uses binary blobs with checksums as storage, which cannot be directly mapped to git clones/checkouts, so it will be necessary to invent some mechanism there.
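To illustrate the HTTP download route mentioned in the list above, here is a minimal sketch that derives a GitHub archive URL from a clone URL. The function name `github_archive_url` is hypothetical, and the sketch only covers GitHub's `/archive/<ref>.tar.gz` pattern; other servers use different layouts.

```python
def github_archive_url(git_url, ref):
    """Map a GitHub clone URL and a commit/tag to its tarball download URL.

    Hypothetical helper; assumes a github.com HTTPS clone URL.
    """
    # Drop the ".git" suffix used by clone URLs, if present.
    base = git_url[: -len(".git")] if git_url.endswith(".git") else git_url
    return f"{base.rstrip('/')}/archive/{ref}.tar.gz"


print(github_archive_url("https://github.com/madler/zlib.git", "<full-commit-sha>"))
```

The resulting URL, together with the sha256 of the downloaded tarball, could then be placed in conandata.yml and consumed by the existing get() call.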
I was just looking for a way to back up sources for a recipe that uses a Git commit and whose source() method I have no control over. Some thoughts:
> Most Git servers like Github, Gitlab have http download URLs to download specific commits, tags, releases, and that already supports the backup sources and caching
This might be an issue for projects that use Git LFS, because whether the LFS objects are included in the tarballs depends on the hosting provider and repository settings: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/managing-git-lfs-objects-in-archives-of-your-repository
> The implementation for this is not straightforward, specially the cache and backup, as this feature uses binary blobs with checksums as storage, which cannot be directly mapped to git clone/checkouts, so it will be necessary to invent some mechanism there.
The problem is that in the recipe we need to know the key for the backup sources (from documentation):
> In your recipe’s source() method, ensure the relevant get/download calls supply the sha256 signature of the downloaded files.
So in the case of a Git repository, the recipe might contain the commit ID to check out, which we could use as the sha signature (but this requires the full commit ID, not just the short one). But what if it's a tag? In that case I can't know the commit ID without contacting the original Git repository, yet the commit ID would be required to look the sources up in the backup.
Or we could generate a sha signature from the repository URL plus the tag or commit to check out, then zip the checked-out commit and save it under that sha signature in the backup sources. The next time somebody needs the same repository URL plus tag/commit, it would produce the same sha signature and could be found in the backup sources.
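A minimal sketch of that key derivation, assuming sha256 over the URL and the ref joined by a newline. The function name `backup_key` and the light URL normalization are assumptions for illustration, not an existing Conan mechanism.

```python
import hashlib


def backup_key(repo_url, ref):
    """Derive a deterministic sha256 key from a repository URL and a tag/commit.

    Hypothetical scheme; the separator and normalization are arbitrary choices.
    """
    # Strip whitespace and a trailing slash so trivially different spellings
    # of the same URL map to the same key.
    normalized = repo_url.strip().rstrip("/")
    payload = f"{normalized}\n{ref.strip()}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


key = backup_key("https://github.com/madler/zlib.git", "<tag-or-full-commit-sha>")
print(key)
```

Unlike the existing checksums, such a key identifies the request rather than the content: a tag can be moved to a different commit upstream, so the archive stored under the key would not be verified by the key itself.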