pip freeze with a hash
- Pip version: 9.0.1
- Python version: 3.5.4
- Operating system: Debian / PureOS
Description:
User story: I am a Python developer with an existing requirements.txt file. I want to add hashes to the file, so that future installations are more secure.
What I've run:
At the moment I need to:
- Locate the package.tar.gz or package.whl
- Run pip hash /path/to/package
- Copy the result into requirements.txt
- Repeat for every package
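The manual steps above could be scripted roughly as follows. This is only a minimal sketch, assuming the downloaded archive is already on disk; `sha256_of` and `requirement_line` are hypothetical helper names, not part of pip. The output follows the `--hash=sha256:...` form that pip accepts in requirements files.

```python
import hashlib


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def requirement_line(name, version, archive_path):
    """Format one pinned requirement with its hash (hypothetical helper)."""
    return f"{name}=={version} --hash=sha256:{sha256_of(archive_path)}"
```

Calling `requirement_line("package", "1.0", "/path/to/package")` for each dependency would produce the lines to paste into requirements.txt; locating the archive is still left to the user.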
It would be great if instead I could:
- Run pip freeze --hash
- Get pip-formatted output with all package names and their hashes
- Copy the result into requirements.txt
Today's solution:
Pipfile is a replacement for requirements.txt that includes hashes in a file called Pipfile.lock.
pipenv is a tool for managing your virtualenv based on Pipfile, including checks against the hashes defined in Pipfile.lock. (It can also convert a requirements.txt file.)
Suggested solution:
Supporting Pipfile at the pip layer (rather than a higher-level tool) is on the PyPA roadmap, see https://github.com/pypa/pipfile#pip-integration-eventual :
pip will grow a new command line option, -p / --pipfile to install the versions as specified in a Pipfile, similar to its existing -r / --requirement argument for installing requirements.txt files. ... To manually update the Pipfile.lock:
$ pip freeze -p different_pipfile
different_pipfile.lock (73d81f) written to disk.
The implication is that this is the preferred solution to supporting hashes (rather than adding them to requirements.txt or pip freeze). The current status is "Deferred till PR" (see this ticket). See also https://github.com/pypa/pip/issues/6925
Is there at least some way to easily script this? E.g., can I loop over a pip freeze and somehow programmatically find the file I need to pass to pip hash?
pip would need to calculate the hash and keep it somewhere as it installs the package. When doing a freeze, it would retrieve that information.
This would be an awesome feature, indeed.
This sounds like a good idea, although I am not sure how it'll work. As @max-wittig pointed out, the hash needs to be computed when the installation occurs, when the installation source is downloaded.
You can get the hash from the cached wheel in ~/.cache/pip/wheels/
It looks like pipenv is getting the hashes directly from the warehouse api
https://github.com/pypa/pipenv/blob/master/pipenv/utils.py#L468-L508
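For anyone wanting to script the remote lookup, here is a rough sketch. It assumes the PyPI JSON endpoint `https://pypi.org/pypi/<name>/<version>/json` returns a `urls` list whose entries carry `digests.sha256`; the function names are made up for illustration, and this is not necessarily what pipenv does internally. Keep in mind the concern raised elsewhere in this thread: hashes fetched this way are only as trustworthy as the index at the time you fetch them.

```python
import json
from urllib.request import urlopen


def fetch_release(name, version):
    """Download release metadata from PyPI (needs network access)."""
    url = f"https://pypi.org/pypi/{name}/{version}/json"
    with urlopen(url) as resp:
        return json.load(resp)


def hashes_from_release(release_json):
    """Collect the sha256 digest of every file in the release, sorted."""
    return sorted(f["digests"]["sha256"] for f in release_json["urls"])


def pinned_with_hashes(name, version, release_json):
    """Format a pinned requirement with one --hash option per release file."""
    parts = [f"{name}=={version}"]
    parts += [f"--hash=sha256:{h}" for h in hashes_from_release(release_json)]
    # Backslash continuations keep the entry valid in a requirements file.
    return " \\\n    ".join(parts)
```

Looping over the `name==version` lines of a pip freeze and calling `pinned_with_hashes(name, version, fetch_release(name, version))` would give a hashed requirements file covering all published files for each release.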
The fact this doesn't exist is just terrible. I just wrote a workaround:
https://github.com/andrewchambers/mummipy
enjoy.
@andrewchambers perhaps instead of the slight barbs consider sending a PR?
It appears that this user story (Python developer wanting to hash their dependencies) is addressed by pipenv, a distinct PyPA project. See https://docs.pipenv.org/basics/#pipfile-lock-security-features for details. So I'm closing this issue assuming that this user story is out-of-scope for pip itself, and best handled by a "higher-level" tool.
Other readers also might be interested in:
Are you saying you think the entirety of pip freeze is out of scope for pip now?
Because if not, this seems like a very logical thing for pip. Not all of us use any current higher level tool, and it's pip itself that introduced the possibility of having hashes in requirements files.
Without this feature it's pretty unfeasible to generate those.
Saying "patches welcome" seems very reasonable, but closing not so much.
On Mon, Aug 6, 2018, 17:26 d❤vid [email protected] wrote:
Closed #4732 https://github.com/pypa/pip/issues/4732.
@Julian note that it was the OP who closed the issue, not the pip developers. The option for someone to create a PR for this remains available to anyone interested in the feature.
Ah, indeed, thanks!
Great, glad to hear it's not being designated as out of scope.
See the proposal for a pip freeze -p pipfile command at https://github.com/pypa/pipfile#pip-integration-eventual , which directly solves this user story for pip. I've reopened this ticket because it is clearly on the (long-term) roadmap for pip.
I've updated the ticket description with the proposed solution (as I understand it). Note that Pipfile-based dependencies are usable today if you use pipenv.
See today's convoluted workaround at your handy https://github.com/peterbe/hashin/issues/100
I just switched to Pipenv, which supports this workflow. Sadly it's still not included in the default Python distribution.
Is there any roadmap or concrete discussion about implementing the proposed -p / --pipfile option that may replace the -r option in the long run? I'm having a hard time finding one.
This will generate a requirements.txt with hashes
pip-compile requirements.txt --generate-hashes
Note that this will directly modify existing requirements.txt file.
You can install pip-compile with pip install pip-tools
It would still be good to have this directly via pip freeze instead of having to use other tooling; pipenv and Pipfile come with their own set of headaches.
pip freeze --hash will be very useful.
This sounds like a good idea, although I am not sure how it'll work. As @max-wittig pointed out, the hash needs to be computed when the installation occurs, when the installation source is downloaded.
If someone wants to file a PR implementing this, they're welcome to do so! Note that we'd like to see this functionality in pip, but the PR would be subject to our regular code review processes (i.e. we're not gonna merge a PR just because someone filed it).
I've labelled this issue as "deferred till PR".
This label is essentially for indicating that further discussion related to this issue should be deferred until someone comes around to make a PR. This does not mean that the said PR would be accepted - that decision has been deferred until the PR is made.
@pradyunsg @deveshks I am thinking about implementing this PR. I think that hashes should not be obtained from remote, and so must be computed at install time (maybe fallback from cache?)
How about adding a new file to the wheel metadata, similar to RECORD, that will contain the pre-install hash of the package? I have never done that, but I saw that it is not a lot of diff and it is a simpler solution. This will allow pip freeze to quickly list hashes of installed packages; however, there would be no support for listing hashes of packages that were installed with older versions of pip. Another difference is that other tools that compute hashes for packages usually get them from a remote PyPI, which seems like an unsafe option overall.
I will start to work on this in the following days, let me know what you think about my idea :smile:
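To make the idea concrete, here is a rough sketch of what writing and reading such a file could look like. Everything in it is an assumption for illustration: the file name `HASH`, its one-line `algorithm=hexdigest` format, and both function names are hypothetical and not defined by pip or by any packaging specification.

```python
from pathlib import Path

HASH_FILENAME = "HASH"  # hypothetical; not a file defined by any packaging spec


def record_source_hash(dist_info_dir, algorithm, hexdigest):
    """At install time: store the hash of the archive that was installed."""
    Path(dist_info_dir, HASH_FILENAME).write_text(f"{algorithm}={hexdigest}\n")


def read_source_hash(dist_info_dir):
    """At freeze time: return (algorithm, hexdigest), or None if absent."""
    path = Path(dist_info_dir, HASH_FILENAME)
    if not path.is_file():
        return None  # e.g. installed by an older pip that wrote no hash
    algorithm, _, hexdigest = path.read_text().strip().partition("=")
    return algorithm, hexdigest
```

The `None` return covers the limitation mentioned above: packages installed before such a file existed would simply have no hash to report.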
@NoahGorny cool you want to work on this.
You'll also want to pay attention to the wheel cache. When installing something that has been built (e.g. an sdist) and cached as a wheel, we probably want the hash of the original sdist or direct url target, and not the hash of the wheel that we have in cache.
What about generating the lock file at install-time, like npm, yarn, pipenv, poetry, Cargo, and Conan do? (sorry if I missed any)
1. pip install -r requirements.txt --lock requirements.txt.lock
2. Commit requirements.txt.lock
3. Afterwards, invoke pip install -r requirements.txt.lock
On updates to requirements.txt, do the same steps.
This directly supports the stated use case:
I am a Python developer with an existing requirements.txt file. I want to add hashes to the file, so that future installations are more secure.
but it avoids a lot of the extra work that is being described in #8519.
@NoahGorny WRT hashes:
- get the hash from the remote
- generate the hash locally
- verify the hashes match
Don't trust remote, but don't necessarily trust local either. Verify they match to give the user assurance that the right thing is installed. If they don't match, generate an error; provide a way to force the install if they don't match, but by default uninstall/rollback if they don't match (or depending where/when you're generating the hashes...don't install to start with, which would be even better).
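A minimal sketch of that check, run before anything is installed, might look like the following. `verify_archive` and `HashMismatch` are hypothetical names, and how the remote hash is obtained is left out; the sketch only shows the local hashing and the comparison.

```python
import hashlib
import hmac


class HashMismatch(Exception):
    """Raised when the local digest differs from the one the remote reported."""


def verify_archive(path, remote_sha256):
    """Hash the downloaded archive locally and compare with the remote hash."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    local = digest.hexdigest()
    if not hmac.compare_digest(local, remote_sha256.lower()):
        raise HashMismatch(f"{path}: expected {remote_sha256}, got {local}")
    return local  # safe to record in a requirements or lock file
```

Raising by default matches the "don't install to start with" behaviour described above; a force option would amount to catching `HashMismatch` and continuing.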
@NoahGorny WRT hashes:
- get the hash from the remote
- generate the hash locally
- verify the hashes match
Don't trust remote, but don't necessarily trust local either. Verify they match to give the user assurance that the right thing is installed. If they don't match, generate an error; provide a way to force the install if they don't match, but by default uninstall/rollback if they don't match (or depending where/when you're generating the hashes...don't install to start with, which would be even better).
We cannot always generate the hash locally after installation; that's why we create the new HASH file. However, I am not sure we should fetch hashes from remote each time we freeze the environment...
What about generating the lock file at install-time, like npm, yarn, pipenv, poetry, Cargo, and Conan do? (sorry if I missed any)
1. `pip install -r requirements.txt --lock requirements.txt.lock`
2. Commit `requirements.txt.lock`
3. Afterwards, invoke `pip install -r requirements.txt.lock`
On updates to requirements.txt, do the same steps.
This directly supports the stated use case:
I am a Python developer with an existing requirements.txt file. I want to add hashes to the file, so that future installations are more secure.
but it avoids a lot of the extra work that is being described in #8519.
This requires users to actively generate lockfiles during installation, and only works if the user is installing from a requirements file in the first place. This is a good option for such users, but in other use cases I think it does not work as well.
The approach suggested by @chrahunt in https://github.com/pypa/pip/issues/4732#issuecomment-657898593 is also valuable in a lot of situations. It has complexities to think through too, for instance when the install command is used to update an existing environment, and when pip decides it does not need to reinstall some already installed dependencies. In such cases we'd still need a way to obtain information about the hashes of installed distributions.
This requires users to actively generate lockfiles during installation, and only works if the user is installing from a requirements file in the first place. This is a good option for such users, but in other use cases I think it does not work as well.
This use case from the original issue assumes we have a requirements file, and several comments refer to Pipfile support, which would work in the same way. I think there may be some people who would want to get their environment set up and then generate a lock file for it, but IMO we risk not actually satisfying this issue adequately if we try to solve that one at the same time.
It has complexities to think through too, for instance when the install command is used to update an existing environment
Good point. It would be worthwhile to see how other dependency managers behave in that situation. If it turns out it's common (and generally agreed to be necessary) to store hashes with the installed packages, then that could be turned right around and included in the PEP itself. :)
@chrahunt to give confidence that the right thing is being installed; I would think you'd want something generated before it's installed that could easily be verified.
Question: what exactly is getting hashed (or being proposed to be hashed)?