Max Backtracking Option and print out current failure causes
What's the problem this feature will solve? When a user has a complex requirements set it's possible that the backtracking can take hours / days / years to find a solution.
In a large requirements list it can be unclear why pip is backtracking as the conflict may exist many layers deep that the user does not know about.
Adding a max backtracking option attempts to solve 2 use cases:
- When a user is debugging, they can set max backtracking to 0, allowing them to manually inspect the failure causes and improve their requirements
- When a user is paying for CPU time, e.g. in a cloud environment, they may prefer pip to fail early in backtracking rather than run for hours. In this case they could set, say, a "reasonable" max backtracking value such as 100 or 1000
Describe the solution you'd like
- Add a maximum backtrack count to the pip CLI and pass it to the `Resolution` object
- Add a `self._backtrack_count` to the `Resolution` object
- When backtracking, increment `self._backtrack_count` and check whether it exceeds the maximum backtrack count
- If it exceeds the maximum backtrack count, log an error message that this happened and raise `ResolutionImpossible(causes)` so the user can inspect the errors that were causing the backtracking
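Sketched in code, the proposal might look something like this. This is a minimal illustration, not pip's or resolvelib's actual internals; the `Resolution` class and `max_backtracks` parameter here are hypothetical stand-ins for whatever the real implementation would use:

```python
# Hypothetical sketch: a resolver that gives up after a configurable
# number of backtracks. Names mirror the proposal above but are NOT
# pip's or resolvelib's real internals.

class ResolutionImpossible(Exception):
    """Stand-in for resolvelib's exception; carries the current causes."""
    def __init__(self, causes):
        super().__init__(f"resolution impossible; causes: {causes}")
        self.causes = causes

class Resolution:
    def __init__(self, max_backtracks=None):
        self._max_backtracks = max_backtracks  # None means unlimited
        self._backtrack_count = 0

    def _backtrack(self, causes):
        self._backtrack_count += 1
        if (self._max_backtracks is not None
                and self._backtrack_count > self._max_backtracks):
            # Surface the conflicts that triggered the final backtrack
            raise ResolutionImpossible(causes)
        # ...normal backtracking would continue here...

# With a hypothetical --max-backtracks 0, the very first backtrack fails
# immediately, leaving the causes available for the user to inspect:
resolution = Resolution(max_backtracks=0)
try:
    resolution._backtrack(causes=["foo>=2.0 conflicts with bar<1.5"])
except ResolutionImpossible as exc:
    print(exc.causes)
```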
Additional context
This requires adding to the pip CLI, updating resolvelib, adding many test cases, and updating the documentation. I currently don't have a strong enough understanding of pip's code base to implement all of this. But if no one else works on this I will try and eventually submit the relevant PRs.
In principle, I'm +1 on this. However, backtracking is done a lot in normal processing (as I imagine you are aware) and it will be extremely hard to give good advice on what would be a "reasonable" number to set the backtracking parameter to.
I think the hardest part of doing this is likely to be to document it in a way that will (a) help users not familiar with the backtracking algorithm when they hit complex resolution issues, and (b) improve the quality of issue reports we receive when users cannot work out what's going wrong.
> In principle, I'm +1 on this. However, backtracking is done a lot in normal processing (as I imagine you are aware) and it will be extremely hard to give good advice on what would be a "reasonable" number to set the backtracking parameter to.
>
> I think the hardest part of doing this is likely to be to document it in a way that will (a) help users not familiar with the backtracking algorithm when they hit complex resolution issues, and (b) improve the quality of issue reports we receive when users cannot work out what's going wrong.
I agree; my thought here is that it's "better than nothing", which is currently the situation for some users. For example, in https://github.com/pypa/pip/issues/10373 pip is not helping the user much: it backtracks seemingly forever with no useful messages as to why. Personally I could spend an hour or two on the issue, inject print statements and breakpoints into pip's codebase, and eventually figure out how to change the order of the requirements to something that will resolve more quickly, but this doesn't help the general user.
I feel like `--max-backtracks 0` in particular, or a `--no-backtrack` flag, would be extremely useful for projects which ship very tight requirements, such as Airflow, so they can put it into their CI/CD pipeline and resolve the requirements file as issues come up. It may also help tools that want to build on top of pip but don't want pip to do the resolution.
My idea to suggest `--max-backtracks` rather than just `--no-backtrack` was to give users flexibility in their approach. It would be difficult for pip to suggest a "reasonable" number, but a motivated enough user could figure out their own "reasonable" number: e.g. they may determine that, for their particular project under normal circumstances, there are fewer than N backtracks, and that if there are more than 10*N backtracks something has gone terribly wrong.
Another idea I had to improve information to users and the quality of bug reports to pip is to log the failure causes when they change. Similar to how raising `ResolutionImpossible(causes)` displays the final conflicts, the causes could also be logged each time they change (so as not to overwhelm the user with thousands of identical messages). It looks like https://github.com/sarugaku/resolvelib/pull/81 and https://github.com/pypa/pip/pull/10258 make a start on this.
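The change-only logging idea can be sketched as follows. This is a hypothetical helper for illustration, not the code in either linked PR:

```python
# Sketch: log the current set of failure causes only when it changes,
# so the user is not flooded with thousands of identical messages.
import logging

logger = logging.getLogger("resolver")

class CauseLogger:
    def __init__(self):
        self._last = None  # frozenset of the last-logged causes

    def report(self, causes):
        """Log the causes if they differ from last time; return True if logged."""
        key = frozenset(causes)
        if key == self._last:
            return False  # unchanged: stay quiet
        self._last = key
        logger.info("backtracking due to conflicts: %s", sorted(causes))
        return True

cl = CauseLogger()
cl.report({"foo>=2.0", "bar<1.5"})  # logged
cl.report({"bar<1.5", "foo>=2.0"})  # suppressed: same set of causes
cl.report({"foo>=2.0"})             # logged: causes changed
```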
There's another issue on this somewhere where I mused about something similar, but I couldn't find it just now. There are several "strategies" to tweak in the resolution. Never backtracking is one (and definitely useful; the problem is mainly how to expose it in a sensible way). Others include:
- Prefer the currently installed version or the latest available version (i.e. what `--upgrade-strategy` currently does)
- Prefer the lowest possible version instead of the latest (called Minimal Version Selection, which is useful for constructing a library's compatibility strategy)
The three are orthogonal, so we would need a way to combine them into one option (say `pip install --strategy=prefer-latest,only-most-preferred`) to not confuse users with too many different options.
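Parsing such a combined option could look roughly like this. The strategy names are taken from the comment above and are purely illustrative; pip has no such flag:

```python
# Sketch: validate a hypothetical comma-separated --strategy value.
VALID_STRATEGIES = {
    "prefer-latest",       # prefer the latest available version
    "prefer-installed",    # prefer the currently installed version
    "prefer-lowest",       # minimal version selection
    "only-most-preferred", # never backtrack past the most-preferred choice
}

def parse_strategy(value):
    strategies = {s.strip() for s in value.split(",") if s.strip()}
    unknown = strategies - VALID_STRATEGIES
    if unknown:
        raise ValueError(f"unknown strategy: {', '.join(sorted(unknown))}")
    return strategies

print(parse_strategy("prefer-latest,only-most-preferred"))
```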
I agree that including something that has the effect of no backtracking in `--upgrade-strategy` makes a lot of sense over adding yet another CLI option. And this perfectly covers use case 1.
Use case 2, though, is still not covered: you have a limited amount of CPU time in which to let pip run, and you would rather fail than spend more. Through your own infrastructure you can implement some kind of kill signal on the pip process, but currently when you do that you don't have anything useful in the logs as to why pip unexpectedly spent so long backtracking.
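The kill-signal approach can be approximated from the user's side today, e.g. by running pip in a subprocess with a timeout and keeping whatever it logged before being killed. A generic sketch (the 600-second cap in the usage comment is an arbitrary example):

```python
# Sketch: cap a subprocess's wall-clock time and keep its partial output.
import subprocess
import sys

def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; return (returncode, stdout), with returncode None on timeout."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_seconds,
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # The process was killed; exc.stdout holds whatever it logged so far
        out = exc.stdout
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return None, out or ""

# Example usage: cap a resolve at 10 minutes and inspect the partial log.
# code, log = run_with_timeout(
#     [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"], 600)
```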
Perhaps, though, https://github.com/sarugaku/resolvelib/pull/81 and https://github.com/pypa/pip/pull/10258 are sufficient to let the user know what went wrong in that time and caused pip to backtrack so much. I was meaning to test them against the list of known problematic requirements I have been working on, but I realized I don't actually know how to test something that requires both resolvelib and pip to be updated. I'll try to take a look this weekend at how pip's vendoring process works and see if I can replicate it locally.
Yeah, it does not cover use case 2. But personally, IMO it's a lost cause to even try to do that. Installing any kind of requirement that does not use strict pinning (`==`) in unattended environments, without explicitly capping the resource usage, is a recipe for disaster in the first place. Perhaps we could, e.g., automatically detect CI environment variables and switch to a "CI mode" that refuses those inputs outright, or something like that, but I don't think we can effectively rescue a user from backtracking after it happens (but I can very well be wrong, so don't let my opinion stop you).
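A rough sketch of what such a "CI mode" check might look like. The environment variable names and the pinned-requirement regex are assumptions for illustration, not anything pip implements:

```python
# Sketch: detect common CI environments and refuse non-pinned requirements.
import os
import re

# Variables commonly set by CI providers (illustrative, not exhaustive)
CI_ENV_VARS = ("CI", "GITHUB_ACTIONS", "GITLAB_CI", "JENKINS_URL")

def in_ci():
    return any(os.environ.get(v) for v in CI_ENV_VARS)

def is_strictly_pinned(requirement):
    # Crude check: "name==version" with no other specifiers or extras
    return re.fullmatch(
        r"[A-Za-z0-9._-]+==[A-Za-z0-9.*+!_-]+", requirement.strip()
    ) is not None

def check_requirements(requirements):
    """In CI, exit with an error if any requirement is not strictly pinned."""
    if in_ci():
        loose = [r for r in requirements if not is_strictly_pinned(r)]
        if loose:
            raise SystemExit(f"refusing unpinned requirements in CI: {loose}")
```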
I just hit an issue where due to a bad git merge a lockfile had incompatible requirements. Pip was backtracking on several of my CI runners for 4+ hours.
@uranusjr Is backtracking in a correct lockfile always an indication of an error? I.e. does pip ever backtrack when given a complete set of compatible requirements, where all of them have `==x.y.z` specifiers?
If so, it would be nice to have something that makes pip fail with EC != 0 whenever the resolution backtracks, to have a safety valve for issues with bad lockfiles.
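One user-side safety valve along these lines is to watch pip's output for its backtracking message and bail out with a non-zero exit code. Recent pip versions print an informational line containing "looking at multiple versions" when backtracking begins; treat the exact wording as an assumption and adjust it for your pip version:

```python
# Sketch: detect pip's backtracking message in its log output and use it
# as a fail-fast signal for lockfile installs.
import subprocess
import sys

# Assumed substring of pip's backtracking log line; verify for your version.
BACKTRACK_MARKER = "looking at multiple versions"

def detect_backtracking(log_lines):
    """Return True if any log line looks like a backtracking message."""
    return any(BACKTRACK_MARKER in line for line in log_lines)

def install_or_fail_fast(pip_args):
    """Run pip, killing it and returning 1 the moment backtracking starts."""
    proc = subprocess.Popen(
        [sys.executable, "-m", "pip", *pip_args],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    )
    for line in proc.stdout:
        if BACKTRACK_MARKER in line:
            proc.kill()
            return 1  # safety valve: a correct lockfile should never get here
    return proc.wait()

# e.g. install_or_fail_fast(["install", "-r", "lockfile.txt"])
```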
@MrMino do you have a reproducible example?
I have an open PR that significantly improves the speed of backtracking when there are lots of possible causes: https://github.com/pypa/pip/pull/12459
Every example I've tried so far has sped up backtracking from hours to minutes; it would be good to see if your issue would also be solved by this.
@notatallshaw I don't think I can share one, sorry. Even if I could, the lockfile in question contains hundreds of packages that aren't available on PyPI, so I doubt this would be of any value to you.
Edit: I could however run your patch and check. Let me see if it's possible.
> Is backtracking in a correct lockfile always an indication of an error? I.e. does pip ever backtrack when given a complete set of compatible requirements, where all of them have `==x.y.z` specifiers?
Pip has some known performance issues with very large pinned requirements files; there is a different open PR which should significantly reduce the time pip spends in this scenario: https://github.com/pypa/pip/pull/12453
@notatallshaw I'm unable to reproduce my original issue on my local VM, so I can't tell if your patch speeds things up. I cannot install your version on my runner either, so I have no results. Sorry.
The root cause of my issue might be connected to the CI cache that I'm using, or a rate limit imposed by my index. Not sure.
Intuitively, judging by the description of your PR, I don't expect much difference in my case. I would expect resolvelib to backtrack until all possible combinations of conflicting dependencies are exhausted; I'm not sure how preferring one cause over another helps in this case.
> I would expect resolvelib to backtrack until all possible combinations of conflicting dependencies are exhausted; I'm not sure how preferring one cause over another is helping in this case.
The problem at the moment is that the backtrack can choose causes which don't really conflict. By preferring conflicts resolvelib can much more quickly prove that it's impossible to resolve.
But certainly, this won't help all situations where pip can get stuck backtracking. That's why I ask for reproducible examples; unfortunately, situations like yours are pretty common in that there's no public way to reproduce them. But thanks for trying.
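For illustration, the "prefer conflicting causes" idea boils down to ordering choices so that requirements already known to be involved in conflicts are tried first. This is a toy sketch of the concept only, not resolvelib's actual preference API:

```python
# Toy sketch: put requirements known to conflict at the front of the
# candidate order, so unsatisfiability is proven sooner.

def order_candidates(requirements, conflicting_names):
    # Conflicting requirements sort first; Python's sort is stable, so the
    # original order is preserved within each group.
    return sorted(requirements,
                  key=lambda name: name not in conflicting_names)

reqs = ["numpy", "foo", "bar", "scipy"]
print(order_candidates(reqs, {"foo", "bar"}))  # foo and bar come first
```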