pip
Should pip uninstall before updating dependencies?
pip version
21.0.1
Python version
3.9.0
OS
Linux
Additional information
No response
Description
When upgrading a wheel, pip seems to do...
- upgrades any dependencies (including installing new dependencies)
- uninstalls the old version
- installs the new version.
Expected behavior
I would expect it to do things in this order...
- uninstalls the old version
- upgrades any dependencies
- installs the new version
How to Reproduce
No response
Output
No response
Code of Conduct
- [X] I agree to follow the PSF Code of Conduct
I have a package P v1.0. I then decided to split this into two separate packages, P v1.1 and D v1.0 where D is a dependency of P v1.1 and contains some of the files that were in P v1.0. I'll refer to those files as F.
If I have P v1.0 installed and then run pip install --upgrade P the following happens:
- D v1.0 is installed which overwrites the F files from P v1.0
- P v1.0 is uninstalled which removes the F files
- P v1.1 is installed.
This leaves an installation without any F files. The fix is then to run pip install --force-reinstall D.
If pip uninstalled P v1.0 before updating the dependencies then this wouldn't happen.
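To make the sequence above concrete, here is a minimal sketch that models the environment as a dict of files. It is not pip's code: `install`, `uninstall`, and the file names are hypothetical, and the only assumptions are that installing overwrites existing files silently and that uninstalling removes every file listed in the package's own record.

```python
def install(env, records, name, files):
    """Write files into env (overwriting silently) and record what was written."""
    for path in files:
        env[path] = name  # last writer wins, like overwriting a file on disk
    records[name] = set(files)


def uninstall(env, records, name):
    """Remove every file in the package's record, whoever wrote it last."""
    for path in records.pop(name):
        env.pop(path, None)


# Hypothetical file lists: "p/f.py" stands in for the F files.
P_OLD = {"p/core.py", "p/f.py"}  # P v1.0 still ships F
P_NEW = {"p/core.py"}            # P v1.1 no longer ships F
D_NEW = {"p/f.py"}               # D v1.0 now ships F


def current_order():
    env, records = {}, {}
    install(env, records, "P", P_OLD)  # starting state: P v1.0 installed
    install(env, records, "D", D_NEW)  # dependency first, overwrites F
    uninstall(env, records, "P")       # old record still lists F, so F is removed
    install(env, records, "P", P_NEW)
    return env


def proposed_order():
    env, records = {}, {}
    install(env, records, "P", P_OLD)  # starting state: P v1.0 installed
    uninstall(env, records, "P")       # F removed while P still owns it
    install(env, records, "D", D_NEW)  # F written by D and left alone afterwards
    install(env, records, "P", P_NEW)
    return env


print("current:", sorted(current_order()))
print("proposed:", sorted(proposed_order()))
```

In this model the current order ends without `p/f.py`, while the proposed order keeps it.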
The logic that leads to this ordering of operations is: pip calculates which packages it needs to change to migrate the environment from the old state to the new, orders them based on what depends on what, and performs each migration by 1. uninstalling the old version and 2. installing the new version. So if “upgrading the dependencies” (there’s really no such thing, it’s “uninstalling the old dependencies” and “installing the new dependencies”) were to happen between uninstalling and installing the dependent package, the whole operation would change from a chain of uninstall-install pairs to deeply nested uninstall-install contexts. To visualise this:
# Now
Uninstall D
Install D
Uninstall P
Install P
# Proposed
Uninstall D
Uninstall P
Install P
Install D
The problem with the proposed logic is that the context would usually have a lot more levels, and as many probably know, CPython is not very good at dealing with deep call stacks.
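The two listings above can be reproduced for any dependency-ordered list of packages. The sketch below only builds the step lists. `current_plan` and `nested_plan` are hypothetical names, and nothing here reflects how pip actually schedules work.

```python
def current_plan(order):
    """Chain of uninstall-install pairs, dependencies first."""
    plan = []
    for name in order:
        plan.append(f"Uninstall {name}")
        plan.append(f"Install {name}")
    return plan


def nested_plan(order):
    """Each uninstall opens a context that is closed by the matching install."""
    uninstalls = [f"Uninstall {name}" for name in order]
    installs = [f"Install {name}" for name in reversed(order)]
    return uninstalls + installs


for title, plan in (("Now", current_plan(["D", "P"])),
                    ("Proposed", nested_plan(["D", "P"]))):
    print(f"# {title}")
    print("\n".join(plan))
```

With a longer list the nested plan gains one nesting level per package, which is the depth concern raised above.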
So a more implementable logic may be to keep the current behaviour, but add a rule to roll back previous uninstall-install operations if an uninstall-install operation fails. This rollback logic is already possible (pip already does the uninstall-install operations in atomic transactions), but we need to move where the transactions are committed from when each package successfully installs to when all packages successfully install.
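A rough sketch of what "commit only when all packages succeed" could look like, using `contextlib.ExitStack` to keep every step open until the whole batch is done. This is an illustration, not pip's transaction code: `replace_package`, `upgrade_all`, the `fail` flag, and the dict standing in for the environment are all hypothetical.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def replace_package(env, name, new_version, fail=False):
    """One uninstall-install step that restores the old version on any error."""
    old = env.pop(name, None)  # uninstall the old version, remembering it
    try:
        if fail:  # hypothetical switch to simulate a failing install
            raise RuntimeError(f"installing {name} {new_version} failed")
        env[name] = new_version  # install the new version
        yield  # the step stays uncommitted until the enclosing stack exits
    except BaseException:
        # Roll back this step, whether it failed itself or a later step did.
        if old is None:
            env.pop(name, None)
        else:
            env[name] = old
        raise


def upgrade_all(env, changes):
    """Apply every change; if any step fails, all earlier steps are undone."""
    with ExitStack() as stack:
        for name, version, fail in changes:
            stack.enter_context(replace_package(env, name, version, fail))
    # Reaching this point means every step succeeded, so all commit together.


env = {"D": "0.9", "P": "1.0"}
try:
    upgrade_all(env, [("D", "1.0", False), ("P", "1.1", True)])
except RuntimeError as exc:
    print("rolled back:", exc)
print(env)
```

Because `ExitStack` unwinds the steps in reverse order, the rollback undoes the most recent change first.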
On 25/02/2021 12:05, Tzu-ping Chung wrote:
The logic that leads to this ordering of operations is: pip calculates which packages it needs to change to migrate the environment from the old state to the new, orders them based on what depends on what, and performs each migration by 1. uninstalling the old version and 2. installing the new version. So if “upgrading the dependencies” (there’s really no such thing, it’s “uninstalling the old dependencies” and “installing the new dependencies”) were to happen between uninstalling and installing the dependent package, the whole operation would change from a chain of uninstall-install pairs to deeply nested uninstall-install contexts. To visualise this:
# Now
Uninstall D
Install D
Uninstall P
Install P
# Proposed
Uninstall D
Uninstall P
Install P
Install D
The problem with the proposed logic is that the context would usually have a lot more levels, and as many probably know, CPython is not very good at dealing with deep call stacks.
That's an implementation detail that can be dealt with by a properly designed algorithm. It can't be used as an excuse to not fix incorrect behaviour.
So a more implementable logic may be to keep the current behaviour, but add a rule to roll back previous uninstall-install operations if an uninstall-install operation fails. This rollback logic is already possible (pip already does the uninstall-install operations in atomic transactions), but we need to move where the transactions are committed from when each package successfully installs to when all packages successfully install.
This would not fix the problem. Each individual uninstall and install succeeded. It was the fact that they were done in the wrong order that resulted in a broken installation.
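Since every step reports success, the damage only shows up when the files on disk are compared with what each installed distribution claims to own. A minimal sketch of such a check follows, assuming wheel-style installs that carry a RECORD file in their .dist-info directory. `missing_files` and `demo` are hypothetical helpers, and the demo builds a throwaway directory instead of touching a real environment.

```python
import csv
import tempfile
from importlib import metadata
from pathlib import Path


def missing_files(dist):
    """Return the RECORD entries of `dist` that no longer exist on disk."""
    record = dist.read_text("RECORD")
    if record is None:  # no RECORD available, nothing to compare against
        return []
    return [
        row[0]
        for row in csv.reader(record.splitlines())
        if row and not dist.locate_file(row[0]).exists()
    ]


def demo():
    """Fake site-packages for a package 'd' whose d/f.py has been removed."""
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp)
        (site / "d").mkdir()
        (site / "d" / "__init__.py").write_text("")
        info = site / "d-1.0.dist-info"
        info.mkdir()
        (info / "METADATA").write_text(
            "Metadata-Version: 2.1\nName: d\nVersion: 1.0\n"
        )
        (info / "RECORD").write_text(
            "d/__init__.py,,\n"
            "d/f.py,,\n"  # recorded, but never created on disk
            "d-1.0.dist-info/METADATA,,\n"
            "d-1.0.dist-info/RECORD,,\n"
        )
        return missing_files(metadata.Distribution.at(info))


print(demo())
```

Against a real environment the same function could be called as `missing_files(metadata.distribution("D"))`, where "D" is whichever distribution is suspected of having lost files.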
It can't be used as an excuse to not fix incorrect behaviour.
Well, if you're gonna be dismissive of others' explanations of how they view the current situation as an excuse, I doubt this discussion is going to result in a solution for anyone. Please be mindful when replying to not come across as dismissive of what others are saying.
I'm struggling to understand what problem this change would solve. Please provide a "real" example of a situation where the current approach of doing things causes a problem.
Is this basically asking for #8119 or proposing a different behaviour/solution to it? Or something else entirely?
On 26/02/2021 11:02, Pradyun Gedam wrote:
It can't be used as an excuse to not fix incorrect behaviour.
Well, if you're gonna be dismissive of others' explanations of how they view the current situation as an excuse, I doubt this discussion is going to result in a solution for anyone. Please be mindful when replying to not come across as dismissive of what others are saying.
Please accept my apologies if I caused any offence. I had interpreted the response as a justification (as opposed to an explanation) of the current behaviour. It is still not clear to me if the issue is considered a bug or not.
IMO the current behaviour is a consequence of the existing approach. Two packages referring to the same files is at best an unusual situation (I'm close to calling it "unsupported"), and your approach of doing the sort of split you talk about via upgrades, rather than clearing up and reinstalling, sounds risky.
Either way, your situation sounds like a rather obscure edge case to me.
Changing the approach like you suggest would potentially have an impact on people who at the moment aren't interacting with us, presumably because the current approach works for them. So we have no means of assessing how many people such a change would break - which is why we're cautious.
Trying to understand what you're doing, and why your approach is necessary (as opposed to something different which does work with pip's current behaviour) is important so that we can at least make a stab at judging the trade-offs. It's not dismissing your situation, or justifying leaving you with a problem. It does, however, mean that we're asking you to put your problem in perspective against all of the other users of pip, and all of the other competing requests we're getting (some of which may conflict with what you're asking for). We need that perspective to properly decide whether this is something we would call a bug. At the moment it's more "undocumented, but possibly implied by existing documentation, or possibly the existing docs imply that it should work, it's hard to tell" 🙁
But suggesting that we wouldn't implement a "properly designed algorithm" or that we're making excuses, doesn't help much. I'm assuming that was born from a frustration with the fact that you have a fairly urgent problem, and it doesn't feel like you'll get a quick fix. Which is true - it's unlikely this is going to be fixed as an urgent priority, even if we do consider it as a bug, so it'll probably be waiting a while before there's a PR you could try out. But that's the nature of open source, I'm afraid.
FWIW, the overwriting files behaviour is also under consideration to be changed, at this time -- see #8119.
While this certainly is an edge case, I hit it too. Yay.
Two packages referring to the same files is at best an unusual situation (I'm close to calling it "unsupported"),
Hopefully my situation is a bit more sane?
I have a PyPI package 'robotpy' that used to have a 'robotpy' package in it, but it didn't do anything because the robotpy PyPI package primarily exists as a virtual package so users can install a bunch of packages without typing a lot.
This year we decided to add a CLI tool, and it made sense to tell users to do python -m robotpy, but it didn't make sense to put that in the meta package because ... well, it's a meta package. So I made robotpy-cli and stuck it in there, and removed it from the robotpy package which now has... nothing.
Anyways. Now when my users upgrade the meta package (which depends on the CLI package) they get __main__; ‘robotpy’ is a package and cannot be directly executed. Sad.
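For anyone trying to diagnose this on a user's machine, a small check like the following reports whether `python -m <package>` has a `__main__` module to run. It is only a sketch: `has_main` is a hypothetical helper, and the standard-library packages are used purely as stand-ins for 'robotpy'.

```python
import importlib.util


def has_main(package):
    """True if `python -m package` would find a package.__main__ module."""
    try:
        return importlib.util.find_spec(f"{package}.__main__") is not None
    except ModuleNotFoundError:
        # The package itself is missing, or the name is not a package at all.
        return False


# Stand-ins from the standard library: unittest ships a __main__, email does not.
for name in ("unittest", "email"):
    print(name, has_main(name))
```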
and your approach of doing the sort of split you talk about via upgrades, rather than clearing up and reinstalling, sounds risky.
In my case, my users are high school students upgrading from our release of a prior year, and every other time they've been able to do it via upgrade, so it seems reasonable that upgrade would continue to work. I have updated our installation instructions, but I'm still getting some bug reports about this issue because .. high school students.
I produced some locally buildable example projects that illustrate the issue if that's helpful, but it sounds like the 'what' is reasonably well understood.