Prefer upper bounds when resolving/backtracking
Fixes: https://github.com/pypa/pip/issues/12993 Fixes: https://github.com/pypa/pip/issues/12990 Fixes: https://github.com/pypa/pip/issues/12430 Fixes: https://github.com/pypa/pip/issues/13030
This PR is built on top of https://github.com/pypa/pip/pull/12982 so that the unit tests can be expanded, either that PR can be reviewed first, or this PR can supplant that PR.
I have developed some benchmark scripts to ensure that changes to pip's resolution algorithm don't regress common real world requirements: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks.
I plan to keep building out more scenarios, you can see the current ones so far here: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/tree/main/scenarios
Upon testing this PR compared to pip 24.2 I see one small regressions and two big improvements:
Difference for scenario scenarios/problematic.toml - autogluon:
Success: False -> True.
Failure Reason: Build Failure -> None.
Difference for scenario scenarios/problematic.toml - boto3-urllib3-transient:
Number of packages processed: 869 -> 871
Difference for scenario scenarios/big-packages.toml - apache-airflow-all:
Number of requirements processed: 593 -> 592
Number of packages processed: 681 -> 661
The fact that autogluon can resolve is a big improvement, apache-airflow[all] gets a noticeable improvement in how many packages it has to process (and this has real time improvement, as the number of packages processed can have O(n^2) complexity) , and a scenario involving boto3 and urllib3 as transient requirements gets a small regression in having to process 2 more packages.
I am hoping to find more real world scenarios where this has a noticeable difference, but I think these results are sufficient to show this approach is a net positive.
Very tentatively adding this to the 24.3 milestone on the basis of:
- If a maintainer with resolver experience can look at https://github.com/pypa/pip/pull/12982 then this PR only adds a small amount of functional code on top: https://github.com/pypa/pip/pull/13017/commits/70f4d92e7e5505bb158992116980a9c46d0db6f0
- This expands the unit tests in that PR to the functional code in this PR
- This is backed up as not regressing against a number of scenarios
- It has a real world issue it fixes
But I understand if no maintainer is available to review.
Added more problematic scenarios in: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/blob/main/scenarios/problematic.toml
And found this also fixes https://github.com/pypa/pip/issues/12430 (which was merged into another issue, but the specific resolution the user had is now solved by this).
I do not know pip resiolution internals - but the rules explained make sense and might improve a number of cases indeed.
I took a look to see whether it made any difference to put upper bound preference above or below backtracking cause preference, and at least in the scenarios I currently have in https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/blob/main/scenarios it didn't make any significant difference (there was a very slight regression of apache-airflow-beam putting it below, as it visited 1 extra package).
So I consider this good in its current position, and if I find a scenario in the future, or a user reports one, where it does make a significant difference, then it can be changed.
Found a minor improvement, in acryl-datahub[all] which has over 300 total dependencies, it visited 1 less requirement, 6 less packages, and produced a slightly better solution: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/pull/2#issuecomment-2425017792
While this looks very reasonable I'd prefer to have another resolver expert (which I am not, unfortunately) to look into this. So postponing.
While this looks very reasonable I'd prefer to have another resolver expert (which I am not, unfortunately) to look into this. So postponing.
I knew this one was pretty unlikely but I thought I'd give it a shot since the recent real world issues raised that this solves.
Going to make a single follow up PR once https://github.com/pypa/pip/pull/13001 lands, I'll comment here once done.