pip-tools
Add compilation check mechanism
I'd like to be able to run a command in CI to ensure that my `requirements.in` produces my `requirements.txt`, mostly as a guard against accidental edits to the latter which get undone when `pip-compile` is next run.
I'm currently using `pip-compile --dry-run` with some bodging of the output, which I then run through a diff against my `requirements.txt`. However it's (predictably) not exactly reliable: sometimes I get warning text in the output which wouldn't be in the requirements file.
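Roughly, the bodge looks something like this (a sketch; the exact status lines to filter out are guesses, which is precisely the fragile part):

```sh
# Capture the dry-run output, strip the status lines pip-compile adds
# (the filtered pattern here is a guess -- this fragility is the problem),
# and diff against the committed file.
pip-compile --dry-run 2>&1 \
  | grep -v 'Dry-run, so nothing updated' \
  | diff - requirements.txt
```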
I can see a couple of solutions:

- make `pip-compile` output the requirements.txt to `STDOUT`, rather than `STDERR`, so that there's a stream which I can use which is sure to be free of other text
- make an actual check mode (sketched below)

The latter feels like the much better solution, though it seems likely to be more effort. I've no idea what the former would break though. (`pip-compile` seems not to use `STDOUT` at all?)
I'm using Python 3.5 and pip-tools 4.1.0.
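For concreteness, the check mode I have in mind might behave something like this (entirely hypothetical; no such flag exists today):

```sh
# Hypothetical --check flag: recompile, compare against the existing
# requirements.txt, and report via exit status rather than writing the file.
$ pip-compile --check requirements.in
requirements.txt is up to date   # exit status 0
# on mismatch: print a diff and exit non-zero, failing the CI job
```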
Hello @PeterJCLaw,

Thanks for bringing this up. I'd prefer to handle it in the spirit of the Single Responsibility Principle:

```sh
$ pip-compile --quiet && git diff --exit-code
```

If the check fails you can see the changes to `requirements.txt`.
That's definitely nicer than my previous approach, though it does assume that the requirements file is under version control (likely, I agree) and that no other changes have been made to the working area, which feels like an odd restriction.
As far as I know, `pip-compile` can only operate on an output file which is the same as its reference/lock file (I'm not sure what term you use for that). If there were a way to have a different input file than the output, that might help here. (This is kinda where I was going with suggesting the change to `--dry-run`'s output.) That said, I'm not actually sure that this is a good idea either (and introducing it just for this case feels a little contrived).
I can see that it would be nice to defer the comparison part to another tool, however one could argue that the `--dry-run` behaviour is already headed down this path -- why would you need `--dry-run` when you can just run `pip-compile` and then reset the file from version control if you don't want the changes? (There's even an argument that that's preferable to `--dry-run`, because you can easily diff the file.)
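That alternative would look something like this (assuming git):

```sh
# Instead of --dry-run: compile for real, inspect, then discard if unwanted.
pip-compile
git diff requirements.txt          # review what would change
git checkout -- requirements.txt   # reset if the changes aren't wanted
```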
In any case, `git diff --exit-code` works for my use-case, so I'm happy to use that for now.
@PeterJCLaw

> In any case, `git diff --exit-code` works for my use-case, so I'm happy to use that for now.
I'm glad to help you. Anyway, I like the idea of the optional check mode, so let's leave this issue open and see whether the community is interested in this feature.
I use `tox` to run my autoformatters (and pip-compile), then linters and tests. In local development that means "fix everything, then check it", but in CI it's followed by `git diff --exit-code` to check that whatever was committed is good.

So count me as a vote for single responsibilities!
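In other words, the CI side of that pipeline amounts to (a sketch; your tox environments will differ):

```sh
# tox runs the autoformatters + pip-compile, then linters and tests;
# the diff then fails the job if any tracked file was modified along the way.
tox
git diff --exit-code
```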
> In any case, `git diff --exit-code` works for my use-case, so I'm happy to use that for now.
It seems I spoke too soon. While this works for the case I was looking at when I reported this issue (in an open source repo of a small project), it doesn't in a case I've been looking at at work.
For the build of one of our (fairly large) internal repos we have a multi-stage CI pipeline which caches the source files into the job workspace after the initial git checkout for speed, but doesn't persist the repo as part of the cache. As a result when we come to run this check, there isn't a git repo to use for diffing.
I could probably copy the `requirements.txt` out of the way first, run a non-dry-run compile and then diff the copy against the new file (sketched below), but that's definitely starting to feel like I'm bodging things again.
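Something like this, presumably (a sketch of the workaround, not a recommendation):

```sh
# Copy-then-diff workaround for environments with no usable git repo.
cp requirements.txt requirements.txt.orig
pip-compile --quiet
diff requirements.txt.orig requirements.txt   # non-zero exit on any drift
```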
It would be great if there was a way to validate a pair of requirements files without relying on a version control repo.
There's another way — use pre-commit, see #974.
Also related to PR #1070 (which is currently under discussion) with the `pip-check` mechanism.
While this doesn't help in the case where the needed git data isn't present, I'll point out that you can pass specific files, like

```sh
$ git diff --exit-code -- requirements.txt
```

so the repo doesn't need to be totally pristine.
When you can't count on git, an alternative comparison method is to create a hash beforehand and verify it after:
```sh
$ md5sum requirements.txt >requirements.txt.md5
$ pip-compile
$ md5sum -c requirements.txt.md5
```
There is a `verify` command in pip-compile-multi (a tool based on pip-compile) which works similarly to @AndydeCleyre's suggestion with MD5, however it's based on the hash of the input file. `pip-compile-multi` adds an additional top line to the lockfile with the SHA1 of the original `.in` file, and `pip-compile-multi verify` simply checks that SHA1.

I think adding similar functionality to `pip-compile` could be backward compatible and useful too.
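If I remember the format right, it looks roughly like this (the hash value below is illustrative):

```sh
# pip-compile-multi writes a hash comment at the top of each lockfile;
# verify recomputes the hash of the .in file and compares it to this line.
$ head -1 requirements.txt
# SHA1:da39a3ee5e6b4b0d3255bfef95601890afd80709
$ pip-compile-multi verify
```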
Edit: I just realized there are complications around handling requirements files that import other requirements files, because the imported files might change, so they will need to be taken into consideration as well.
Also, if an already pinned version gets revoked from the repository, the same input would generate new output. I'm guessing we'd want the check to "fail" in this case, rather than just confirming that the input is unchanged?
> Also, if an already pinned version gets revoked from the repository, the same input would generate new output. I'm guessing we'd want the check to "fail" in this case, rather than just confirming that the input is unchanged?
Good point: if the version is revoked, the previous output file would be unusable.
For anyone who comes across this issue, I came up with a solution that doesn't have much to do with resolving pip dependencies. I just hash both the input and output files and store the hashes in version control. That way, if either of them changes, CI can tell by recomputing the hashes and comparing them to the stored ones.
```python
import hashlib

def get_file_hash(file: str) -> str:
    # Normalize line endings before hashing so the hash is stable across OSes.
    with open(file) as f:
        s = '\n'.join(line.rstrip() for line in f)
    return hashlib.md5(s.encode()).hexdigest()
```
The `rstrip` is mostly because of different line endings on different OSes. I want to make sure the hash is the same wherever the repo is cloned.
The full script can be found here.
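In outline, the check then works like this (the file and script names here are illustrative, not taken from the linked script):

```sh
# Locally, after running pip-compile: record hashes of both files and commit them.
python compute_hashes.py requirements.in requirements.txt > requirements.hashes
git add requirements.hashes

# In CI: recompute and compare; diff exits non-zero if either file changed.
python compute_hashes.py requirements.in requirements.txt | diff - requirements.hashes
```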
I'd like the hashing idea to be supported in pip-compile natively. That's because we use `pyproject.toml` for our dependencies, and I don't want to hash all of `pyproject.toml`. It would be ideal if `pip-compile` could extract dependency specifications from the inputs, subject to `--extras`, constraints, etc., hash ONLY those, and write the hash into the compiled `requirements.txt` as a comment, near the "This file is autogenerated by pip-compile" line. This would allow checking, and fast re-runs if nothing has changed.