Providing pip configuration in `sys.base_prefix`

Open pelson opened this issue 3 years ago • 35 comments

pip version

main

Python version

all

OS

all

Additional information

When a pip config is installed in an installation prefix $PREFIX/pip.conf, $PREFIX/bin/python -m pip correctly picks up the config. When one makes a virtual environment with $PREFIX/bin/python -m venv ./my-venv then ./my-venv/bin/pip does not pick up the config.

It is questionable if this is a bug or a feature request, but essentially, I believe that pip should be looking in sys.base_prefix as well as sys.prefix for a config file.
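A minimal sketch of the lookup being asked for, assuming the site-level file is called pip.conf and sits directly under a prefix (as in the paths in this report; on Windows pip uses pip.ini instead). The helper names are hypothetical and are not part of pip:

```python
import sys
from pathlib import Path
from typing import List


def in_virtual_environment() -> bool:
    """A venv is active when sys.prefix differs from sys.base_prefix."""
    return sys.prefix != sys.base_prefix


def candidate_site_configs(prefix: str, base_prefix: str) -> List[Path]:
    """Hypothetical helper: site-level config paths, base installation first.

    Outside a virtual environment both prefixes are equal, so only one
    path is returned.
    """
    candidates = [Path(base_prefix) / "pip.conf"]
    if prefix != base_prefix:
        candidates.append(Path(prefix) / "pip.conf")
    return candidates
```

Calling candidate_site_configs(sys.prefix, sys.base_prefix) inside a venv would list both the base installation's file and the venv's own file; today only the second is consulted.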

Description

No response

Expected behavior

No response

How to Reproduce

Starting with a non-virtual environment (e.g. a conda environment):

$ touch ./env/pip.conf
$ pip config debug
env_var:
env:
global:
  /etc/xdg/xdg-ubuntu/pip/pip.conf, exists: False
  /etc/xdg/pip/pip.conf, exists: False
  /etc/pip.conf, exists: False
site:
  /media/important/github/pypa/pip/env/pip.conf, exists: True
user:
  /home/pelson/.pip/pip.conf, exists: False
  /home/pelson/.config/pip/pip.conf, exists: False

$ python -m venv ./venv

## BEWARE THAT YOUR VENV HAS THE BUNDLED PIP, SO INSTALL A NEWER PIP FOR DEBUGGING
$ ./venv/bin/pip install -e /path/to/pip/repo

$ ./venv/bin/pip config debug
env_var:
env:
global:
  /etc/xdg/xdg-ubuntu/pip/pip.conf, exists: False
  /etc/xdg/pip/pip.conf, exists: False
  /etc/pip.conf, exists: False
site:
  /media/important/github/pypa/pip/venv/pip.conf, exists: False
user:
  /home/pelson/.pip/pip.conf, exists: False
  /home/pelson/.config/pip/pip.conf, exists: False

What I want to see for a venv is that the base_prefix is searched as well as the prefix.

Output

No response

Code of Conduct

  • [X] I agree to follow the PSF Code of Conduct

pelson avatar Apr 01 '21 15:04 pelson

Hi! Thanks for filing this issue and PR!

Could you please elaborate on why you want this? What use case does this help with?

pradyunsg avatar Apr 01 '21 16:04 pradyunsg

I think this is by design? A virtual environment is supposed to isolate things from the global environment, so configuration from sys.base_prefix should not be picked up.

uranusjr avatar Apr 01 '21 17:04 uranusjr

Could you please elaborate on why you want this? What use case does this help with?

Sure, I'll try with my specific case, and then hopefully we can draw parallels elsewhere.

I am responsible for a Python distribution on a controlled network. There are other Python distributions on the controlled network (both accessible from the same machine) for which I am not responsible, and which need specific pip configurations. I too have specific pip settings (e.g. index-url), so I configure them in the pip.conf of the distribution. The design is that users (developers) create virtual environments from this distribution in order to extend it as needed; pip is a fundamental part of that workflow. I therefore want those users to have the appropriate pip configuration within their virtual environments (out of the box).

To re-iterate, it is not acceptable to have the config be global (e.g. in /etc/pip.conf) nor user-level (e.g. in ~/.config/pip/pip.conf) as they need to also be able to use another Python distribution with different pip configurations (i.e. the pip config needs to be isolated to the environment).

Environment variables defining the config aren't acceptable because the two Python distributions wouldn't be able to inter-operate (you can only have one PIP_INDEX_URL).

To me the idea that a virtual environment should inherit the pip configuration from the base environment is natural. After all, a virtual environment inherits the Python and its standard library from the base environment - if you change the config of how Python is built such config propagates/leaks to the virtual environment.

pelson avatar Apr 01 '21 18:04 pelson

IMO, the only sort of environment that should inherit is a --system-site-packages one, as that's explicitly not isolated from the system environment. Certainly changing the default behaviour to inherit would be a significant backward incompatibility, and would likely cause problems for people who rely on global settings not leaking into a virtualenv (e.g., testing).

pfmoore avatar Apr 01 '21 18:04 pfmoore

Thank you for your input so far.

would likely cause problems for people who rely on global settings not leaking into a virtualenv (e.g., testing)

We should try to be more precise with the terminology here. global settings do leak into a virtualenv:

$ cat /etc/pip.conf 
[global]
no-cache-dir = true

$ cat ./venv/pip.conf 
[global]
no-dependencies = yes

$ pip config list
global.no-cache-dir='true'
global.no-dependencies='yes'

What I'm suggesting is, just like Python config (e.g. compile flags) leaks into a virtual environment, so too should the pip config. i.e. site configuration should leak into a virtual env.
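The blending shown in the transcript above can be illustrated with the standard library's configparser. This is only a sketch of layered INI merging under the assumption that later sources override earlier ones; merge_configs is a hypothetical helper, not pip's own loader:

```python
import configparser

# The two files from the transcript above, inlined as strings.
GLOBAL_CONF = "[global]\nno-cache-dir = true\n"
SITE_CONF = "[global]\nno-dependencies = yes\n"


def merge_configs(*sources: str) -> dict:
    """Merge INI sources in order; a later source overrides an earlier key."""
    parser = configparser.RawConfigParser()
    for text in sources:
        parser.read_string(text)
    return {
        f"{section}.{key}": value
        for section in parser.sections()
        for key, value in parser.items(section)
    }
```

Merging GLOBAL_CONF and SITE_CONF yields both global.no-cache-dir and global.no-dependencies, matching the pip config list output above.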

To turn this discussion around a little, perhaps I could ask you how you would control a virtual environment's pip configuration if global, user, and environment settings are not acceptable (because (a) a pip config doesn't apply to all Python installations on a machine, (b) it isn't a user configuration to define how a Python installation should interact with the distribution)?

I'm trying to not be hypothetical here, so my concrete scenario:

  • 1 machine
  • No internet, but internal repositories exist
  • 2 completely different Python distributions
  • Each python distribution has different pip configuration requirements (e.g. index urls)
  • Users use virtual environments to give them extensibility

The problem occurs fairly quickly: users take a virtual environment from one of the distributions and immediately have no pip configuration. Even worse, when they use a PEP 517 tool such as build to create a wheel (python -m build ./) with the correctly (pip) configured Python distribution, build isolation creates a new virtual environment in which pip isn't configured, and they get an error.

pelson avatar Apr 01 '21 19:04 pelson

@pelson I have to admit the scenario you brought up is interesting. :)

To turn this discussion around a little, perhaps I could ask you how you would control a virtual environment's pip configuration if global, user, and environment settings are not acceptable (because (a) a pip config doesn't apply to all Python installations on a machine, (b) it isn't a user configuration to define how a Python installation should interact with the distribution)?

After taking a quick look at docs of venv it looks like you could extend EnvBuilder class and define post_setup() method.

From https://docs.python.org/3/library/venv.html#venv.EnvBuilder.post_setup

post_setup(context) A placeholder method which can be overridden in third party implementations to pre-install packages in the virtual environment or perform other post-creation steps. (emphasis mine)

The post-creation step in your case could be creation of appropriate pip configuration file in the newly created virtual environment.
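A minimal sketch of that suggestion, assuming a Unix layout where the site config lives at pip.conf directly under the environment root. The class name and the pip_conf_text parameter are hypothetical; only venv.EnvBuilder, post_setup() and context.env_dir come from the standard library:

```python
import os
import venv


class ConfiguredEnvBuilder(venv.EnvBuilder):
    """EnvBuilder that writes a pip config file into each new environment."""

    def __init__(self, *args, pip_conf_text: str, **kwargs):
        super().__init__(*args, **kwargs)
        self.pip_conf_text = pip_conf_text

    def post_setup(self, context):
        # context.env_dir is the root of the new environment,
        # i.e. what sys.prefix will be inside it.
        conf_path = os.path.join(context.env_dir, "pip.conf")
        with open(conf_path, "w", encoding="utf-8") as conf_file:
            conf_file.write(self.pip_conf_text)
```

Something like ConfiguredEnvBuilder(with_pip=True, pip_conf_text="[global]\nno-cache-dir = true\n").create("./my-venv") would then produce a venv whose pip finds the file as its site config. This only helps when environment creation goes through this class.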

piotr-dobrogost avatar Apr 05 '21 09:04 piotr-dobrogost

After taking a quick look at docs of venv it looks like you could extend EnvBuilder class and define post_setup() method.

That's my recommendation for using build too, to extend virtualenv via its plugins: https://github.com/pypa/build/issues/270#issuecomment-812003023.

gaborbernat avatar Apr 07 '21 12:04 gaborbernat

Thanks for the suggestion of extending EnvBuilder. Indeed this does work to automatically configure a venv with pip.conf, and I've been running with such a tweak for quite some time. Unfortunately the python -m venv and python -m build invocations aren't something that you can override in this way (without monkeypatching during Python startup :speak_no_evil:).

Perhaps I could have been a bit clearer here: I'm not looking for a workaround per se (I have one, though I do appreciate ideas of other potential workaround solutions), I'm looking to address what I believe is a genuine shortfall (i.e. unexpected behaviour) in the way pip behaves with virtual environments.

That's my recommendation for using build too, to extend virtualenv via its plugins

I'm not using virtualenv, I'm using venv, the standard library way to make virtual environments. So too does build (https://github.com/pypa/build/blob/0.3.1/src/build/env.py#L188), no? There is no equivalent plugin system for venv.

the only sort of environment that should inherit is a --system-site-packages one

I disagree: I am not proposing that you inherit a single extra package with this change, merely that a locally (site) scoped pip.conf from a parent environment should be inherited in the child (virtual) one. I think treating configuration inheritance the same as package inheritance (system site packages) would conflate two separate things under the --system-site-packages flag.

I'm trying to figure out real-world use cases where you wouldn't want to inherit the pip.conf from the parent environment. The one provided by @pfmoore "isolation testing" is a reasonable one, but I don't believe it should necessarily trump the one in which a user wants to create a virtual environment on a Python distribution which has been correctly pip configured and expect that their virtual env pip to "just work™".

In my PR (#9753) I implemented the easier approach of blending together the pip.conf in sys.base_prefix and sys.prefix (just like all of the other config search path items). It would totally solve my use case if, instead of doing a blend, it simply didn't look in sys.base_prefix if sys.prefix / pip.conf existed. This would solve the "isolated testing" use case, as you would simply touch sys.prefix / pip.conf to avoid sys.base_prefix / pip.conf being considered. This may be the best of both worlds: you can easily enough have isolated testing, but still get working behaviour out of the box.

pelson avatar Apr 11 '21 04:04 pelson

build uses virtualenv if it’s available, and falls back to venv when it’s not.

uranusjr avatar Apr 11 '21 05:04 uranusjr

"isolation testing" is a reasonable one, but I don't believe it should necessarily trump the one in which a user wants to create a virtual environment on a Python distribution which has been correctly pip configured and expect that their virtual env pip to "just work™".

Also, isolation is important from the point of view of bug triage - "can you reproduce your issue in an empty virtualenv" is much easier than "can you reproduce your issue if you create a new virtualenv, hunt down and disable any config files you may have created in the past and forgotten about, ..."

But the real problem is that both are valid requirements. As a result, we're not discussing what the behaviour should be, but rather which behaviour should be default, and which should be opt-in.

Backward compatibility is a significant weight in favour of ignoring the site config by default.

IMO you've made some good arguments that your use case is worth considering, but your arguments are not strong enough to switch the default. If you want to continue arguing for your behaviour being the default, I suggest you focus on how to address the backward compatibility issue.

pfmoore avatar Apr 11 '21 08:04 pfmoore

I don't understand how #9753 is going to help with build inheriting a parent venv's configuration. When you create a venv from another venv, sys.base_prefix does not point to the parent venv's prefix; it always points to the installed Python's prefix. pip does not look inside sys.base_prefix I assume because (a) they would not want to encourage users to drop pip configuration directly in e.g. /usr and (b) because, in most cases, global configuration serves the same purpose.

layday avatar Apr 11 '21 10:04 layday

Also, isolation is important from the point of view of bug triage - "can you reproduce your issue in an empty virtualenv" is much easier than "can you reproduce your issue if you create a new virtualenv, hunt down and disable any config files you may have created in the past and forgotten about, ..."

I'm only semi-convinced about this one. You still have to deal with people's environment variables, global config and user config. In reality you'd tell them to run pip config debug and, if it existed, perhaps tell users to set PIP_DISABLE_CONFIG=1 for a truly clean config-free reproducer.

IMO you've made some good arguments that your use case is worth considering, but your arguments are not strong enough to switch the default.

Thank you for your insight and pointing out where I should focus the discussion.

But the real problem is that both are valid requirements. As a result, we're not discussing what the behaviour should be, but rather which behaviour should be default, and which should be opt-in.

If you want to continue arguing for your behaviour being the default, I suggest you focus on how to address the backward compatibility issue.

My proposal to read sys.base_prefix / pip.conf IFF sys.prefix / pip.conf doesn't exist is the easiest way to enable both behaviours conveniently, but it is indeed backwards incompatible.

The problem is that the two use cases are in conflict - so we need to be able to configure which behaviour we want. But in order to configure this using non-global config, we need to read the base_prefix's config...

Backward compatibility is a significant weight in favour of ignoring the site config by default.

The previous statement I made is the strongest I have in favour of changing the default. If we change the default, it is easy to then make an isolated virtual environment (touch sys.prefix / pip.conf) if you need it. The existing pip config debug is clear and continues to apply. If we don't change the default then it requires some more complexity in pip to decide if it should consider sys.base_prefix / pip.conf.

So let's take a look at what that might actually entail if we don't change the default:

We need some non-global, non-venv and non-envvar means to tell pip to blend sys.base_prefix / pip.conf with sys.prefix / pip.conf. So we could perhaps look at sys.base_prefix / pip.conf and look for a config item called, say use-base-pip-conf: true (default to false). If it is set, we include the rest of sys.base_prefix / pip.conf when building the pip configuration.
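The pre-read step could look roughly like the sketch below. It assumes the hypothetical use-base-pip-conf key lives in the [global] section of the base prefix's pip.conf; neither the key nor the helper exists in pip:

```python
import configparser


def base_config_opted_in(base_conf_path: str) -> bool:
    """Return True only if the base config explicitly enables inheritance.

    A missing file, section or key means "not opted in", which keeps the
    existing isolated-by-default behaviour.
    """
    parser = configparser.RawConfigParser()
    parser.read(base_conf_path)  # a missing file is silently skipped
    return parser.getboolean("global", "use-base-pip-conf", fallback=False)
```

Only when this returns True would the rest of the base file be blended into the venv's configuration. Note that getboolean raises ValueError for a value that is not a recognised boolean, so a real implementation would need to decide how to report that.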

It isn't the nicest behaviour because we need to pre-read the config to figure if we want to read the rest of the config. pip config debug needs to become more nuanced ("the base config file exists, but we don't use it because it is not enabled"). This is the cost of keeping the "isolated by default" behaviour though. I can imagine this feature leading to a bit of confusion, if I'm honest.

Do you have different ideas about how we might be able to satisfy the two use cases without changing the default? Fundamentally I'm struggling because the "isolation" behaviour is a subset of the "non-isolation" one (i.e. you can configure the "non-isolated by default" one to be isolated, but you can't configure the "isolated by default" one to be non-isolated).

pelson avatar Apr 12 '21 06:04 pelson

If we can step away from discussing the implementation for a minute, what's not been made clear through the course of this conversation is that this change would only be applicable to Conda environments, because Conda apparently mangles the value of sys.base_prefix of a Python installed in a Conda environment. This means that you can keep a development environment-local pip configuration in the "global" Python prefix, a peculiarity only found in Conda environments.

layday avatar Apr 12 '21 11:04 layday

@layday I'm not familiar with Conda but the problem seems perfectly valid in the realm of standard Python. If I understand correctly the original issue was "Isolated venv does not inherit PIP index" (https://github.com/pypa/build/issues/270). This has nothing to do with Conda. Actually I am surprised this issue has not come up way earlier. Just think of all the corporate environments where one is required to use a specific pip index. It is natural to set this pip index as part of the base Python installation (often on a machine shared by many users) and hope it would propagate to virtual environments based on the base installation. I am in favour of reevaluating what the isolation of a virtual environment is supposed to mean. I guess the original intent was to isolate the virtual environment from packages installed in the base installation. We should consider whether isolation from the base installation's configuration was an accident that does more harm than good.

piotr-dobrogost avatar Apr 12 '21 15:04 piotr-dobrogost

I'd question whether this is a valid configuration vector outside of a Conda environment. The intention with loading configuration from {sys.prefix}/pip.conf was to allow configuring pip in a virtual environment (and extended to support Conda environments in #6268), i.e. from a development-local config file. Corporate entities can configure pip globally instead of on a Python prefix basis - sys.base_prefix can be shared by, but can also vary (!) across, multiple Python installations on the same system. What this means in practice is that you can ensure neither that a pip config file located in sys.base_prefix applies to only one specific Python installation nor that it is global.

layday avatar Apr 13 '21 01:04 layday

what's not been made clear through the course of this conversation is that this change would only be applicable to Conda environments

I don't think this conversation has anything to do with conda. Here is a clean build of Python:

$ wget https://www.python.org/ftp/python/3.9.4/Python-3.9.4.tgz
$ tar xzf Python-3.9.4.tgz 
$ ./Python-3.9.4/configure --prefix $(pwd)/py39
$ make
$ make install

With this:

$ ./py39/bin/python3 -m venv ./my-venv

$ ./py39/bin/python3
>>> import sys
>>> sys.base_prefix
'/home/pelson/Downloads/cpython_clean/py39'
>>> sys.prefix
'/home/pelson/Downloads/cpython_clean/py39'

$ ./my-venv/bin/python
>>> import sys
>>> sys.base_prefix
'/home/pelson/Downloads/cpython_clean/py39'
>>> sys.prefix
'/home/pelson/Downloads/cpython_clean/my-venv'

And the pip config:

$ touch ./py39/pip.conf
$ ./py39/bin/pip3 config debug
env_var:
env:
global:
  /etc/xdg/xdg-ubuntu/pip/pip.conf, exists: False
  /etc/xdg/pip/pip.conf, exists: False
  /etc/pip.conf, exists: True
    global.no-cache-dir: true
site:
  /home/pelson/Downloads/cpython_clean/py39/pip.conf, exists: True
user:
  /home/pelson/.pip/pip.conf, exists: False
  /home/pelson/.config/pip/pip.conf, exists: False

$ ./my-venv/bin/pip config debug
env_var:
env:
global:
  /etc/xdg/xdg-ubuntu/pip/pip.conf, exists: False
  /etc/xdg/pip/pip.conf, exists: False
  /etc/pip.conf, exists: True
    global.no-cache-dir: true
site:
  /home/pelson/Downloads/cpython_clean/my-venv/pip.conf, exists: False
user:
  /home/pelson/.pip/pip.conf, exists: False
  /home/pelson/.config/pip/pip.conf, exists: False

To "step away from the implementation", as you say, I want the pip config for the base environment to apply to the virtual environment one. sys.base_prefix is the documented way to identify "base environment" (https://docs.python.org/3/library/sys.html#sys.base_prefix).

Perhaps I missed something in what you were saying though.


To keep the conversation on track, the 3 proposals that have so far been discussed:

  1. pip always looks at sys.base_prefix/pip.conf and blends it together with sys.prefix/pip.conf if they exist
  2. pip looks at sys.base_prefix if-and-only-if sys.prefix/pip.conf doesn't exist
  3. pip looks at sys.base_prefix/pip.conf to determine if it should consider the rest of sys.base_prefix/pip.conf to be blended with sys.prefix/pip.conf. By default this would be false so the current default behaviour would remain.

My preference is option 2 as it represents a far simpler implementation and is far less confusing to explain, and therefore less likely to generate support requests to pip. Unfortunately this is a breaking change as pointed out by @pfmoore. To get back to the old behaviour, one just needs to touch sys.prefix/pip.conf (or remove the sys.base_prefix/pip.conf!).
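To make the differences concrete, here is a sketch of which site-level files each proposal would load. The function name, the numeric option argument and the base_opted_in flag are hypothetical, the file name pip.conf assumes a Unix layout, and loading the base file first (so the venv's file wins on conflicts) is an assumption rather than something settled in this thread:

```python
from pathlib import Path
from typing import List


def site_config_files(prefix: Path, base_prefix: Path, option: int,
                      base_opted_in: bool = False) -> List[Path]:
    """Return site-level config files to load, lowest priority first.

    option 0 is the current behaviour; options 1-3 follow the numbered
    proposals above.
    """
    venv_conf = prefix / "pip.conf"
    base_conf = base_prefix / "pip.conf"
    if prefix == base_prefix or option == 0:
        return [venv_conf]  # not in a venv, or current behaviour
    if option == 1:
        return [base_conf, venv_conf]  # always blend
    if option == 2:
        # fall back to the base file only when the venv has none
        return [venv_conf] if venv_conf.exists() else [base_conf]
    if option == 3:
        # blend only when the base config explicitly asks for it
        return [base_conf, venv_conf] if base_opted_in else [venv_conf]
    raise ValueError(f"unknown option: {option}")
```

Under option 2, touching the venv's pip.conf is enough to restore the isolated behaviour, which is the escape hatch described above.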

As far as I can see right now there is no other way for us to use non-global, non-user, non-environmental means to allow a venv to be configured out of the box - I have an open question to hear of other viable implementation options. To re-iterate, for the use case it is essential that a venv can be configured out of the box (e.g. isolated wheel building). It is completely acceptable that the base environment needs to be configured to do this (i.e. it isn't the default behaviour), but it is not OK for it to be a post-venv creation step, as this would preclude the use of venv-using tools.

pelson avatar Apr 13 '21 12:04 pelson

It is relevant to Conda because only with Conda will you have a different base prefix for every development environment. Without this precondition there is no use case for base prefix dependent configuration as explained above.

layday avatar Apr 13 '21 12:04 layday

It is relevant to Conda because only with Conda will you have a different base prefix for every development environment.

No one is talking about different base prefix for every environment. The point is to have all virtual environments based on one, specific base Python installation to inherit pip configuration from this one, specific base Python installation. Also no one besides you mentioned Conda in any way.

Corporate entities can configure pip globally instead of on a Python prefix basis - sys.base_prefix can be shared by, but can also vary (!) across, multiple Python installations on the same system.

That's exactly what OP stated:

There are other Python distributions on the controlled network (both accessible from the same machine) for which I am not responsible, and which need specific pip configurations.

As to:

What this means in practice is that you can ensure neither that a pip config file located in sys.base_prefix applies to only one specific Python installation nor that it is global.

Exactly the opposite is true. sys.base_prefix is unique per Python installation thus taking its pip's configuration into consideration in virtual environments allows all such environments to share common configuration tailored to the needs of this specific base Python installation.

piotr-dobrogost avatar Apr 13 '21 13:04 piotr-dobrogost

To keep the conversation on track, the 3 proposals that have so far been discussed

... and to provide context, pip's current behaviour

  • pip only looks at sys.prefix and never looks at sys.base_prefix.

This boils down to, pip looks at the currently active environment, the site configuration, and the user configuration. This is also simple to explain.

Do you have any examples of other software that looks up configuration from both sys.prefix and sys.base_prefix? I'm not aware of any.

🤷 I guess what it boils down to is that for me, the change is more difficult to describe than the current behaviour, it's not backward compatible, and it doesn't seem useful enough to justify the maintenance cost (without trying to be dismissive, your situation is clearly a fairly rare special case).

pfmoore avatar Apr 13 '21 13:04 pfmoore

Conda is mentioned in the bug report. I’ve explained why inheriting from a base prefix is problematic in other contexts.

I’m going to drop out of this conversation now since we’re going round in circles and people are starting to forget their manners.

layday avatar Apr 13 '21 13:04 layday

your situation is clearly a fairly rare special case

I think we might miss a chance to make important improvement for many use cases if we keep treating this as a rare special case. For one, there's nothing particularly special in having more than one base Python installation on one machine (especially in multi-user environment). The other thing is that I think we should see this issue as the one surfacing rather common problem of not having a way to transparently pre-configure virtual environments so that their users could easily obtain working environment without following additional, tedious configuration steps which otherwise could be automatised and hidden. I guess people got used to how things are in this regard and, instead of looking to improve the overall situation as this issue tries to, are just creating and following the instructions needed to make a newly created virtual environment actually useful for anything.

piotr-dobrogost avatar Apr 13 '21 14:04 piotr-dobrogost

I think we might miss a chance to make important improvement for many use cases if we keep treating this as a rare special case.

Can you give examples of other use cases?

For one, there's nothing particularly special in having more than one base Python installation on one machine (especially in multi-user environment).

But the special case here isn't just about having more than one base Python installation. It's about having multiple installations which must have unique pip configurations. And about wanting virtual environments to inherit that configuration, which is demonstrably rare, because it's not pip's current behaviour (if everyone wanted inheritance, why has no-one raised this before?)

The other thing is that I think we should see this issue as the one surfacing rather common problem of not having a way to transparently pre-configure virtual environments so that their users could easily obtain working environment without following additional, tedious configuration steps which otherwise could be automatised and hidden.

Agreed, but that's an issue with virtual environments, not with pip. Surely you're just as likely to want to pre-configure your virtual environment for other tools as well? And as has already been noted, virtualenv has this facility with its plugin feature, so this is only a problem for the stdlib venv.

I guess people got used to how things are in this regard

Possibly. Or possibly (most) people don't have a problem with the current situation. We currently have no way of knowing. Which is why I'd like to see more evidence that this is a general problem before changing pip's default behaviour.

pfmoore avatar Apr 13 '21 15:04 pfmoore

This boils down to, pip looks at the currently active environment, the site configuration, and the user configuration. This is also simple to explain.

It might sound simple, but even the term site is ambiguous here (without even considering the literal meaning of "site"). You could mean the site module (std lib) which comes from the sys.base_prefix, or the site-packages directory which comes from the sys.prefix, or if the venv is created with --system-site-packages you could mean both sys.prefix and sys.base_prefix.

Correction: I've just re-read your statement, and now understand that the thing you call "site configuration" is what pip calls "global" configuration, and what you called "currently active environment" is actually the "site" config (in pip config terms).

My point here is not to nit-pick your words, but that "it's simple to explain" isn't particularly true (there is a huge amount of detail in the global and user configuration), and it certainly isn't significantly worse if we have to include the following statement in the documentation: "for site config pip reads sys.prefix / pip.conf if it exists, and if not will fall back to sys.base_prefix / pip.conf if that exists, otherwise no site config will be used".


As far as I can see what remains of the discussion comes down to a judgement between the relative merits of one default behaviour vs another. As a reminder the major competing use cases:

  • A user of a multi-tenant machine (e.g. a supercomputer or an organisation's internal cluster), with multiple Python distributions requiring distinct pip configs (e.g. because of a different index url based on risk profile / python version) which are non global, non user, non environmental, and where python -m venv is expected to result in a working pip in the venv

vs

  • A user who wants to test pip with no site config by creating a venv in order to quickly check that a certain pip behaviour is not a result of a failed config (but global, user and envvar config remain)
  • A developer of a tool who expects pip in a clean venv to be entirely isolated from the parent environment so that pip behaves predictably and consistently (but global, user and envvar config remain)

(note: I've genuinely tried to represent the use cases fairly, but given what I've written, I seem to have found it difficult. Please feel free to add or refine the use cases.)

I don't think I can provide any more information to help on this judgement without risking going around in circles, and we're already quite a long way into this conversation. If it is decided not to allow venvs to inherit the base config out of the box then I would consider it unfortunate for organisations such as universities and national labs, but ultimately perhaps they have the resources to work around it (by hacking pip in some way, or by enforcing virtualenv instead of venv and using its plugin system). Of course, if it is decided that my proposal is indeed a reasonable next step then I would be more than happy to polish the implementation in #9753.

If there are viable proposals for not changing the default, but making the behaviour configurable (non global, non user, non environmental) I'd also be happy to hear them.

Thanks to all for your time so far :+1:

pelson avatar Apr 13 '21 20:04 pelson

A user of a multi-tenant machine (e.g. a supercomputer or an organisation's internal cluster), with multiple Python distributions requiring distinct pip configs

I realise now that I've not even included my own setup here. In my case I provide a Python distribution on a shared network mounted disk (NFS in this case) such that a diverse set of (Linux) machines and users can have the same Python distribution. In my scenario I literally cannot control the global configuration (I don't get mounted at /etc) and there are hundreds of users who I don't control and can't enforce user configuration.

pelson avatar Apr 17 '21 10:04 pelson

I've just re-read your statement, and now understand that the thing you call "site configuration" is what pip calls "global" configuration

I just checked the code here, and the output of pip config debug, to confirm I'm not confusing things here. Global and site configuration are different and site configuration is held in sys.prefix. It looks like site config isn't mentioned in the documentation, which is possibly why we're having such a hard time understanding each other, but site configuration is a real thing.

It's still true that site configuration isn't inherited by virtual environments (because they have a different sys.prefix) but that's deliberate, as we've said.
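For anyone wanting to check this on their own setup, a small sketch that prints the two prefixes and the resulting site config location. The variable names are mine; the path is `sys.prefix` joined with `pip.conf` (`pip.ini` on Windows), matching the `site:` entry in the `pip config debug` output earlier in this issue.

```python
import os
import sys

# In a virtual environment sys.prefix points at the venv, while
# sys.base_prefix points at the installation the venv was created from.
in_venv = sys.prefix != sys.base_prefix

# Site config lives directly under sys.prefix; the file is pip.ini on Windows.
config_name = "pip.ini" if os.name == "nt" else "pip.conf"
site_config = os.path.join(sys.prefix, config_name)

print(f"sys.prefix      = {sys.prefix}")
print(f"sys.base_prefix = {sys.base_prefix}")
print(f"in venv         = {in_venv}")
print(f"site config     = {site_config} (exists: {os.path.isfile(site_config)})")
```

Run with the base interpreter and then with the venv's interpreter; the `site config` line differs because `sys.prefix` differs.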

pfmoore avatar Apr 17 '21 11:04 pfmoore

I'll encourage folks to use the vocabulary established in https://pip--9474.org.readthedocs.build/en/9474/explanations/configuration/ if you're talking about pip configuration files to avoid confusion here. (that's from a documentation rewrite for pip I'm doing)

pradyunsg avatar Apr 17 '21 14:04 pradyunsg

It might be worth clarifying in those docs that the site location is {sys.prefix}/pip.ini, and applies to all environments, not just virtualenvs. The existing docs have the same problem.

pfmoore avatar Apr 17 '21 14:04 pfmoore

I appreciate all of the effort that has gone into this issue so far. Unfortunately I think it has stalled a bit, possibly as a result of a switch of focus mid-discussion.

The key messages in https://github.com/pypa/pip/issues/9752#issuecomment-817510909 and https://github.com/pypa/pip/issues/9752#issuecomment-819025200 are:

  • The "isolation for bug triage" use-case is better served by telling people to set PIP_DISABLE_CONFIG=1, otherwise you have to worry about the many other ways of configuring pip. In effect, the "isolation" is only isolation from "site" config, you still get the config from everywhere else - IMO it is highly unlikely that people are trying to isolate from only the site config.

  • The change is indeed backwards incompatible. To achieve comparable behaviour to today, a proposal was made to prevent sys.base_prefix and sys.prefix config merging, thereby meaning that touch sys.prefix/pip.conf would be sufficient to restore the fully isolated behaviour.

    • This wasn't my original proposal (I was effectively advocating a 4th layer in https://pip.pypa.io/en/stable/topics/configuration/?highlight=configuration#configuration-files), and the proposal is actually easier to document than originally: The site configuration is taken from $VIRTUAL_ENV/pip.conf. If this file is not found and $BASE_ENV/pip.conf exists, the configuration will be taken from there.

    • There were no other palatable alternative proposals. I continue to be open to anything that viably solves the use case using venv without changing the default behaviour (I have workarounds, so am looking for proposals that can be documented in the pip docs).

pelson avatar Feb 17 '22 08:02 pelson

I'm not sure it's stalled, so much as run its course.

My position is that this is absolutely a feature request, not a bug. And personally, I don't see a sufficiently good case having been made for the feature, so I am -1 on including it.

If someone wants to push for it, I think they'll need to do 2 things:

  1. Produce a PR, including documentation, so that we have concrete evidence for how much additional complexity this will add to pip, both in terms of code and in terms of describing pip's behaviour.
  2. Successfully argue the case that the benefits are sufficient to justify the additional complexity.

However, I would caution anyone investing time in the above: it's by no means a foregone conclusion that if you do the work, it will get accepted. At the moment, the only evidence we have is from one user, the OP, whose workflow would benefit from this (there have been other people, including me, expressing interest in the idea, but as far as I can tell that's all been theoretical, without actual use cases). We'd need the feature to be much more broadly useful (either in terms of examples of other people who would use this, or other reported issues that could be handled with this feature) if it's to be justified.

pfmoore avatar Feb 17 '22 09:02 pfmoore

At the moment, the only evidence we have is from one user, the OP

I currently represent 400+ scientific and engineering Python users, and have quite extensive experience of deploying Python to large scientific organisations who regularly have network isolated (internet-free) machines with internal package indexes (HPCs, critical op infrastructure, etc.). I'm not saying I'm correct or my opinion is worth more as a result, but I'm genuinely doing this in the interest of a significant user population who you would otherwise never see as they are behind an organisational firewall.

  1. Produce a PR, including documentation

The PR I produced offered concrete evidence (https://github.com/pypa/pip/pull/9753). It is stale, but remains a useful metric of the additional complexity. Essentially, it doesn't introduce any new functions as there is already behaviour to pick from the first found config file from a list of files.

I probably need to update the docs to be consistent with the terminology refined by @pradyunsg, which I can do if there is a hint of my proposal being accepted.

  2. Successfully argue the case that the benefits are sufficient to justify the additional complexity.

I personally think the complexity is low (please feel free to leave comments on the complexity in the PR if you disagree).

The benefits are clearly harder to be convincing about (I've tried, and either I'm not being clear enough, or you've understood the use case and don't think it is a useful one). I have tried quite hard to explain the use case in this issue, but will reiterate the example as succinctly as possible:

As a maintainer of a network mounted Python distribution for hundreds of scientific users in an isolated network environment, I want my users to be able to create virtual environments on their own machines (via venv, virtualenv, or some other method) which inherit the pip configuration from the base environment, so that my users can correctly pip install subsequent packages from the pre-configured package index into their virtual environment. It should not rely on env-vars (unreliable), and must not rely on a file existing in a user's homespace or on their machine's local disk (e.g. in /etc/).
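For comparison, this is roughly what a workaround outside pip can look like with the stdlib: a `venv.EnvBuilder` subclass that copies the base config into each new environment. It is a sketch under assumptions, not something proposed elsewhere in this thread: `copy_site_config` and `ConfigInheritingEnvBuilder` are hypothetical names, the file name is `pip.conf` (it would be `pip.ini` on Windows), and the copy is a one-off snapshot rather than real inheritance. It also only helps if users create environments through this builder instead of plain `python -m venv`, so it does not satisfy the requirement above.

```python
import os
import shutil
import sys
import venv


def copy_site_config(base_prefix, env_dir, name="pip.conf"):
    """Copy base_prefix/name into env_dir unless the venv already has one.

    Returns the destination path if a copy was made, otherwise None.
    """
    src = os.path.join(base_prefix, name)
    dst = os.path.join(env_dir, name)
    if os.path.isfile(src) and not os.path.exists(dst):
        shutil.copyfile(src, dst)
        return dst
    return None


class ConfigInheritingEnvBuilder(venv.EnvBuilder):
    """Hypothetical builder that seeds new venvs with the base pip config."""

    def post_setup(self, context):
        # context.env_dir is the directory of the venv being created.
        copy_site_config(sys.base_prefix, context.env_dir)


# Usage (creates a real venv, so left commented out here):
# ConfigInheritingEnvBuilder(with_pip=True).create("./my-venv")
```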

If this use-case is understood and is not going to be supported, I suggest we close the issue. Otherwise, I will be happy to revive the simple PR in order to see it over the line. Furthermore, I will be happy to make further contributions to improve docs around the config in particular, if that is welcome/desired. I have a long-term interest in the feature being maintained, and would happily be pinged in order to address/resolve any future complications/requests around the config topic.

pelson avatar Sep 07 '22 07:09 pelson