Ship pip as a standalone application

Actually, it occurred to me that we may even be able to do this right now. I put together a very simple proof of concept and it seems to work. If you put the following script alongside a "lib" directory with pip installed into it (pip install pip --target lib) but with the bin and pip*.dist-info directory removed (so the bundled pip isn't visible in pip list) then it can be run from any Python interpreter to effectively act as a copy of pip in that environment.

#!/usr/bin/env python

import runpy
import sys
import os

lib = os.path.join(os.path.dirname(__file__), "lib")
sys.path.insert(0, lib)

runpy.run_module("pip", run_name="__main__")

I don't think it would take much to turn this into a viable "standalone pip" application (I'd mostly just want to set up an executable wrapper for Windows). I've done some very basic testing - this would need a lot more real-world testing to make sure there aren't any problem edge cases, but it basically seems to work.
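The preparation of the "lib" directory described above could be scripted roughly as follows. This is only a sketch: the function names `install_pip_into` and `strip_bundle` are made up for illustration, and it assumes `--target` puts scripts in `bin` and metadata in a `pip-<version>.dist-info` directory, as described in the post.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def install_pip_into(lib):
    """Run `pip install pip --target lib` using the current interpreter."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "pip", "--target", str(lib)],
        check=True,
    )


def strip_bundle(lib):
    """Remove pip's dist-info and the bin directory from `lib`.

    Without the dist-info directory the bundled copy does not show up in
    `pip list`. Returns the names of the directories that were removed.
    """
    lib = Path(lib)
    removed = []
    for entry in sorted(lib.glob("pip-*.dist-info")) + [lib / "bin"]:
        if entry.is_dir():
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed
```

Calling `install_pip_into("lib")` followed by `strip_bundle("lib")` next to the launcher script should leave a "lib" directory containing only the `pip` package itself.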

Originally posted by @pfmoore in https://github.com/pypa/pip/issues/11223#issuecomment-1179518843

For now, this is just a placeholder to discuss whether we want to do this at all, or how we'd distribute it. The main point here is that with a script like this, there would no longer be a need to install pip in every virtual environment.

One thing we'd have to work out is what tools assume that pip is available in every environment. I'm thinking of environment managers and IDEs, like nox, or VS Code. The ecosystem implications here are likely to be more complicated than the technical issues. Maybe we need to start with a heads-up discussion on Discourse? But before we do that I'd like to make sure the pip committers are all on board with the idea...

Jul 09 '22 10:07 pfmoore

Very interesting approach.

Jul 09 '22 11:07 sbidoul

There will be teaching implications too (undoing years of python -m pip).

Jul 09 '22 11:07 sbidoul

The script will need to check python version compatibility.
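A minimal sketch of such a check, assuming the bundled pip needs Python 3.7 or later (which is what pip 22.1.x declares; adjust `MIN_PYTHON` for other releases). The name `check_python` is hypothetical, and the code sticks to `str.format` so that it still parses on the old interpreters it is meant to reject.

```python
import sys

# Assumption: the bundled pip requires at least this Python version.
MIN_PYTHON = (3, 7)


def check_python(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return an error message if the interpreter is too old, else None."""
    if tuple(version_info[:2]) < tuple(minimum):
        return "This pip requires Python {}.{}+, but this is Python {}.{}".format(
            minimum[0], minimum[1], version_info[0], version_info[1]
        )
    return None
```

The launcher could call `check_python()` before `runpy.run_module(...)` and pass any returned message to `sys.exit()`.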

Jul 09 '22 11:07 sbidoul

There will be teaching implications too

At least initially, this can be an alternative, rather than a replacement. But absolutely, this is a significant change in approach. Which is why I think it needs to be flagged in advance. I'd post a topic on the Packaging discourse right now, but I'm frankly scared of the controversy it'll probably cause 😨

Jul 09 '22 11:07 pfmoore

I mean, we can also ship pip as a zipapp. IIUC, that should still not be visible in pip list, and it's literally python pip.pyz ... which would be equivalent to python -m pip.

That's easy to communicate as well. :)

Jul 09 '22 12:07 pradyunsg

More broadly though, I'm on board. :)

Jul 09 '22 12:07 pradyunsg

I mean, we can also ship pip as a zipapp.

This is true. I'm not sure we can simply zip up the pip directory and call it a zipapp, but we can certainly ship a zipapp containing the script I posted above plus a copy of pip.
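As a rough sketch of that idea, the standard library's `zipapp` module can package a directory containing a copy of pip together with a small `__main__.py`. The helper name `build_zipapp` and the `module` parameter are illustrative only; inside a zipapp the archive itself is first on `sys.path`, so the separate "lib" directory from the earlier script is not needed.

```python
import zipapp
from pathlib import Path

# Entry point written into the archive. When run as `python pip.pyz`, the
# archive is sys.path[0], so the bundled package can be imported directly.
MAIN_TEMPLATE = """\
import runpy

runpy.run_module({module!r}, run_name="__main__")
"""


def build_zipapp(source, target, module="pip"):
    """Write a __main__.py into `source` and zip the directory into `target`.

    `source` is expected to already contain the package (for example the
    result of `pip install pip --target source`). `target` should live
    outside `source`.
    """
    source = Path(source)
    (source / "__main__.py").write_text(MAIN_TEMPLATE.format(module=module))
    zipapp.create_archive(source, target, interpreter="/usr/bin/env python")
    return Path(target)
```

The result can be run as `python pip.pyz ...` with any interpreter. Whether every vendored dependency behaves correctly from inside a zip is a separate question that this sketch does not address.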

Do we know if all of our dependencies work when shipped as a zipapp (I believe requests didn't like the certificate file being in a zip at one stage, but IIRC that's fixed now)? Also, does the mechanism we use for injecting pip into a build environment work from a zipapp?

Shiv gets round this by creating zipapps that extract themselves on first use. I don't know if we want to go that far.

Otherwise, the main things that annoy me about zipapps are (1) python pip.pyz doesn't search PATH, and (2) .pyz files aren't registered on Windows to run from the command line by default (they need to be added to PATHEXT), and even when they are we have the old problem that nothing but an exe file is a "first class citizen" 🙁
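To illustrate point (1): a tiny wrapper could do the PATH lookup itself and then hand the file to the current interpreter. The sketch below is an assumption about how that might look, with a hypothetical `find_on_path` helper. It deliberately avoids `shutil.which`, which by default only returns files marked executable and, on Windows, relies on PATHEXT.

```python
import os


def find_on_path(filename, path=None):
    """Return the first file called `filename` in a PATH directory, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A wrapper could then run `[sys.executable, found] + sys.argv[1:]` through `subprocess.call`. That still leaves the Windows "first class citizen" problem in point (2) unsolved.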

As an initial step in this direction, though, we could ship a .pyz - virtualenv does it, and I'm pretty sure a couple of other tools do as well, so it's not an unfamiliar model to people. We could then promote the idea as "if you don't want to install pip in all of your environments, you can use the zipapp version (and use --no-pip when creating virtualenvs)".

That's something I'd be comfortable announcing as a plan on Discourse...

Jul 09 '22 13:07 pfmoore

I think the main hurdle toward shipping a standalone application (versus a zipapp) is source build. If someone needs to build something from source, it's likely they'll want to build against an existing Python installation, instead of the interpreter bundled in the standalone executable, and that'll need some additional mechanism.

Wheel-only installations should be more or less plausible. The only thing we need to deal with (that I can think of) is console script shebangs.

Jul 09 '22 14:07 uranusjr

The key here is that the standalone executable doesn't bundle an interpreter[^1]. That's basically what the /usr/bin/env python shebang achieves. It runs the included pip in the environment's own Python.

[^1]: Or if it does, it executes pip with the installation interpreter, not the bundled one. But that's harder (not impossible, but a bit more fiddly).

Jul 09 '22 14:07 pfmoore

How to upgrade pip is going to be a topic. pip install --upgrade pip is not going to do what people expect.

If we want to be fancy, the script could have a mechanism to download the latest pip for the corresponding python version.

Jul 10 '22 08:07 sbidoul

Initially, I'd prefer to just publish a zipapp at https://bootstrap.pypa.io, like virtualenv does. Users can download that to get the latest version. Maybe we could also publish it as a GitHub release for people who want a specific version. I'd leave installers and upgraders to the community to provide, if they want (on Windows, for example, scoop and chocolatey can handle this, and on Linux distro packagers fulfil that role, I guess).

Agreed that pip install --upgrade pip will be confusing, but I'm not sure there's much we can do about that, apart from having a gradual transition. Maybe we could add a warning to pip so that if it detects that it's not running from the location that it will upgrade, we let the user know? That might be useful in any case, not just for this situation.
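One possible shape for that detection, purely as a sketch: compare the location of the running pip package with the environment's site-packages directories as reported by `sysconfig`. The function names and the message text are hypothetical, and this is not how pip implements anything today. User-site installs are ignored for simplicity.

```python
import os
import sysconfig


def _normalize(path):
    return os.path.normcase(os.path.realpath(path))


def runs_from_environment(pip_file, site_dirs=None):
    """Return True if `pip_file` lives inside one of the site directories."""
    if site_dirs is None:
        paths = sysconfig.get_paths()
        site_dirs = [paths["purelib"], paths["platlib"]]
    pip_file = _normalize(pip_file)
    return any(
        pip_file.startswith(_normalize(site_dir) + os.sep)
        for site_dir in site_dirs
    )


def upgrade_warning(pip_file, site_dirs=None):
    """Return a warning string when an upgrade would not affect the running pip."""
    if runs_from_environment(pip_file, site_dirs):
        return None
    return (
        "The pip you are running is not installed in this environment, "
        "so upgrading pip here will not change it."
    )
```

Inside pip, `pip_file` would be something like `pip.__file__`. For a zipapp that path points into the `.pyz` archive, which is never under site-packages, so the warning would fire.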

Jul 10 '22 10:07 pfmoore

Noting this here, so that we don't forget -- we'd want to update the upgrade prompt, to be aware of the zipapp based workflow and behave differently. What that different behaviour should be is something I don't have an opinion on, and I don't intend to think about that until we get somewhere in the discussion. :)

Jul 10 '22 10:07 pradyunsg

An interesting future capability could be that pip would no longer have to vendor as it could be isolated from the target environment.

Jul 10 '22 10:07 RonnyPfannschmidt

I don't think we'd get to that point, not in the order of decades -- we're still going to allow installing pip in environments, so the core reasons for vendoring will continue to exist.

Jul 10 '22 10:07 pradyunsg

Agreed. A zipapp version of pip could debundle, but there's no point unless we drop support for installing pip in environments.

Jul 10 '22 11:07 pfmoore

I don't think we could debundle even in the zipapp -- it'd still be possible to have a version of requests/urllib3 (for example) in the environment that won't work with whatever version of pip is being used via a zipapp.

Jul 10 '22 11:07 pradyunsg

For what it's worth, I've just created https://github.com/pfmoore/runpip

The build script is there, and I've published a 22.1.2 release that has the pyz as a downloadable asset. If people want to play with it, go ahead. I think I'm going to make it my default pip locally and see how that works out.

Jul 10 '22 11:07 pfmoore

I don't think we could debundle even in the zipapp

Ah, I was thinking of debundling but still shipping all of the vendored libraries in the zipapp. Yeah, working with locally installed copies of our dependencies is never going to work.

Jul 10 '22 11:07 pfmoore

Yea, I'm not sure what would take precedence in the import paths -- but we know vendoring works and we need it for our primary use case today anyway. Let's table this -- we're all on the same page I think. :)

Jul 10 '22 11:07 pradyunsg

I just added an option to the test suite to run pip from a zipapp (specifically, script.pip runs the zipapp, not the installed pip). For the integration tests[^1], I got

69 failed, 775 passed, 38 skipped, 6 xfailed, 2015 warnings

Not that bad, actually. And from a quick scan, many of the failures look like either assumptions about the location of the running pip, or "unexpected changes" caused by the extraction of cacert.pem to a temporary directory. So overall, that's relatively strong evidence that the zipapp is functional. At some point I'll try to work through the test failures, but for now I don't consider passing the test suite to be a necessary condition for publishing an experimental zipapp, if we choose to do so. Does anyone disagree?

Edit: FWIW, without using the zipapp, I get the following on my machine:

10 failed, 834 passed, 38 skipped, 6 xfailed, 2015 warnings

I believe the 10 failures are due to git on my PC being configured with init.defaultBranch=main and some "filename too long" errors. So 59 possible issues to investigate and confirm that it's the test, not the zipapp, that's at fault.

[^1]: I assume the unit tests probably don't use script.pip much, if at all.

Jul 11 '22 14:07 pfmoore

#11248 fixes one of the problems (28 failures), getting us down to 41 failures (31 if you ignore the 10 unrelated ones).

Jul 11 '22 15:07 pfmoore

Most of the rest are down to the unexpected existence of cacert.pem in the temporary directory. I fixed this by allowing scripttest to ignore that file when running from a zipapp.

I'm going to finish on this for today, but I think we're most of the way there now.

The biggest outstanding task is working out a way to automatically build an up to date zipapp when running the tests with --use-zipapp. For that, I ideally need to be able to build a wheel of the pip code under test. Of course, I don't want to build that wheel with the pip under test itself, in case it's broken... And looking at the test suite, I'm not even 100% sure I know of a reliable way of finding the code under test - the only copy I think I can rely on existing is the one installed in the test environment's site-packages. I suppose I could read all the installed files by starting from pip.__file__, but that seems pretty awful...

Does anyone know a good way of building a wheel of the pip under test from the running test suite? Am I overthinking this, and there's a simple answer I'm missing?

Jul 11 '22 19:07 pfmoore

I assume the unit tests probably don't use script.pip much, if at all.

They're not allowed to. :)

Jul 11 '22 19:07 pradyunsg

I was nerd sniped by this, so I created https://github.com/sbidoul/pip-launcher, which automatically downloads the correct pip version using get-pip.py (python 2.7+). I've symlinked that as pip in my PATH and I'll see how it goes.

[update] renamed from pip-script to pip-launcher

Jul 16 '22 12:07 sbidoul

lol, nice. We're going to end up with a whole raft of different approaches to running pip without installing it. I have had pip.pyz installed as pip in my path for about a week now, but I think that in order to get a proper feel for how well it works, I need to configure virtualenv (and pew, if I can work out how to do that as well) to default to --no-pip --no-setuptools --no-wheel. It's probably just some environment variables to set.

What I plan on doing over the next week sometime (it's been busy this week) is to put together a post on the packaging Discourse, saying something along the lines of

The pip team are experimenting with alternative deployment methods for pip, which avoid the need for pip to be installed in every environment. We're aware that this will be a pretty big change in what people can expect, as there is currently a strong assumption that pip will be available in every Python environment. So we'd be interested in any feedback on how this could affect people's workflows, or tools. To be clear, we're not expecting to change the official deployment method in the short term, but we will be offering (and supporting) other approaches, and we'd like to get a better feel of the impact so that we can determine how to plan the rollout and how to frame the announcements.

Does that seem OK to people? Do you want me to post a draft somewhere so that the @pypa/pip-committers can review the post before I make it?

Jul 16 '22 13:07 pfmoore

I’m fine with this wording and don’t think we need to put this somewhere for edits. In any case, I don’t feel strongly about the phrasing of the post and am happy to defer to others on that. It might make sense to link to this issue as well — again, I trust your judgement on whether that’s useful.

Jul 16 '22 13:07 pradyunsg

probably just some environment variables to set.

Setting VIRTUALENV_NO_PIP to 1 does the job.

Does that seem OK to people?

Fine with me. No need to review AFAIC.

Jul 16 '22 14:07 sbidoul

FWIW, I've long thought it would be a great idea if pip stopped introspecting the current environment, and instead supported a CLI flag to target a specific environment (which then defaulted to which python).

Doing that, would mean you could use something like pyoxidizer to ship a whole Python with pip, including things like statically compiled extensions and what not.

Jul 17 '22 13:07 dstufft

One thing we may need to consider is pip "plugins". We don't have those formalized today (although we may in the future), but some pip features already try to import packages to enable themselves (such as the new truststore feature flag, or keyring support). So some mechanism to make additional packages available to the pip launcher may be necessary - it could be as simple as inserting them in sys.path too. Coming up with a good UX for that may be more challenging, though.
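For illustration, the two halves of that idea might look like the sketch below: a launcher-side helper that makes extra directories importable, and the "try to import, otherwise disable the feature" pattern. Both function names are hypothetical, and how the launcher would learn about the extra directories (the UX question) is left open.

```python
import importlib
import sys


def add_plugin_paths(paths):
    """Make extra directories importable, skipping ones already on sys.path.

    Entries are appended rather than inserted at the front, so they cannot
    shadow the bundled pip. This is a design choice made for this sketch.
    """
    for path in paths:
        if path not in sys.path:
            sys.path.append(path)


def optional_feature(module_name):
    """Import the module that enables a feature, or return None if missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None
```

A feature gated on keyring, for example, would be enabled only when `optional_feature("keyring")` returns a module.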

Jul 17 '22 13:07 sbidoul

FWIW, I've long thought it would be a great idea if pip stopped introspecting the current environment

Agreed. I think there's an issue somewhere for this, but it's a more complex change. For now, I think a zipapp that runs in any environment is a useful starting point, as it breaks the implication that pip is present in every environment (which I suspect will be the big hurdle for some people).

This has been on my "if I get round to it" long term plan for ages, as well 🙂

some pip features already try to import packages to enable themselves

Again, for now I'm personally fine with the idea that such packages need to be installed in the target environment (or the user sets up $PYTHONPATH to make them importable from a shared location). At some stage, I think we need to bite the bullet and decide what we want to do about "plugins" (either pip features gated on the presence of certain modules, or fully-independent plugins) but again, that's a much bigger question.

to the pip launcher

However, be aware that I'm thinking here about the simple "zip all of pip up into a pip.pyz" approach that I'm working on for inclusion in 22.3. A more full-featured "pip launcher" that adds features like enabling plugins, etc, could have a much more complex UI, but I'm not sure it's necessary at this point.

Jul 17 '22 14:07 pfmoore