Adding pipdeptree's functionality to pip code base
What's the problem this feature will solve?
Hi.
I am the author and current maintainer of pipdeptree and I wanted to check if pypa org / pip's maintainers would be open to the idea of adding pipdeptree's functionality to pip itself.
The current use case of pipdeptree is to render a graph of dependencies, and for that it uses pip's code (yes, import pip, I know).
The current state of pipdeptree borders on "unmaintained" because, (1) I am not finding time for open source work lately due to personal commitments and (2) python is not my primary language at work any more, so I am not up-to-date with the recent developments in the python/pip ecosystem.
The original use case of pipdeptree was to make it easy to track down conflicting dependencies. I believe conflicting dependencies are no longer a problem on newer environments thanks to the new dependency resolver. But people still seem to be using it, either for older envs or for its graph output.
One problem with pipdeptree since the beginning has been its dependence on the internal functions of pip. So every time there's an internal refactor in pip, a "stable" version of pipdeptree can potentially break. I was fully aware of this right from the start but the script did solve a problem for me and others at that time (2015). As the project became more popular, I (with the help of contributors) continued to keep the code up-to-date with the latest pip versions, and even backward compatible with older pip versions.
Describe the solution you'd like
If pip is interested in the graph output functionality of pipdeptree, I propose the code (under MIT license) can be moved inside pip and provided as a subcommand. This will ensure that,
- There are no breakages due to internal refactor (considering that pipdeptree would also be "internal" to the repo)
- The community can maintain it in a more timely manner
As I mentioned above, pipdeptree uses import pip under the hood and the core functionality of constructing the dependency tree is decoupled from the CLI. So I believe it should be straightforward to move the code without many changes.
If this proposal gets approved, I can help with moving the code to pip in whatever way possible.
Alternative Solutions
Another option for me is to find a maintainer and transfer the ownership of the repo on github and pypi.org. However, the possibility of having to fix pipdeptree after pip releases remains.
Additional context
Something similar was proposed earlier (long ago)
- https://github.com/pypa/pip/pull/2329
- https://github.com/pypa/pip/issues/8077
Code of Conduct
- [X] I agree to follow the PSF Code of Conduct.
Hi @naiquevin,
My personal view on this (not talking for other pip maintainers) is that this is unlikely to happen in the short term. Indeed, the pip team is so small with regard to the number of critical tasks that need to be done, that it is not the right time to increase the scope of pip with features that can be implemented outside of pip.
To help with this we recently added the pip inspect command which should, in principle, provide everything you need by calling pip as a subprocess using a supported and stable CLI interface, therefore reducing your maintenance burden. This should also make it easier for you to have pipdeptree run in a separate environment (so you can install it with pipx for instance).
If this approach sounds interesting to you, we'd very much value your input on the pip inspect JSON format.
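As an illustration of the subprocess approach, here is a minimal sketch. It assumes the report shape documented for pip inspect (a top-level "installed" list whose items carry a "metadata" object with "name" and, optionally, "requires_dist"). The function names are hypothetical and are not part of pip or pipdeptree.

```python
import json
import re
import subprocess
import sys


def run_pip_inspect(python=sys.executable):
    """Run `pip inspect` for the given interpreter and return the parsed report."""
    result = subprocess.run(
        [python, "-m", "pip", "inspect"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def normalize(name):
    """Normalize a project name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def direct_dependencies(report):
    """Map each installed project to the project names in its requires_dist.

    Markers and extras are ignored in this sketch: only the leading project
    name of every requirement string is extracted.
    """
    deps = {}
    for item in report.get("installed", []):
        metadata = item["metadata"]
        names = set()
        for req in metadata.get("requires_dist", []):
            match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
            if match:
                names.add(normalize(match.group(1)))
        deps[normalize(metadata["name"])] = sorted(names)
    return deps
```

Because the tool only talks to pip through its command line, `run_pip_inspect` can be pointed at the interpreter of any other environment, which is what allows the tool itself to live in a separate environment.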
I agree with @sbidoul here. In particular, if you don't have the time to support pipdeptree, who would support that code if it were moved into pip? The pip maintainers, while maybe not as lacking in time as you are, are unlikely to have the bandwidth to give pipdeptree the attention it needs.
It's likely that if you want pipdeptree to continue, your best approach would be to find one of the existing users who is interested enough to take on maintenance, either as a co-maintainer, or to inherit the project from you. As @sbidoul noted, the new pip inspect command could then be used to reduce the support cost (by avoiding the need to track every pip release) and bring the project into a more sustainable state.
Best of luck!
> Indeed, the pip team is so small with regard to the number of critical tasks that need to be done, that it is not the right time to increase the scope of pip with features that can be implemented outside of pip.
@sbidoul Fair enough.
> To help with this we recently added the pip inspect command which should, in principle, provide everything you need by calling pip as a subprocess using a supported and stable CLI interface, therefore reducing your maintenance burden. This should also make it easier for you to have pipdeptree run in a separate environment (so you can install it with pipx for instance).
@sbidoul Thanks for pointing me to pip inspect. I wasn't aware of it. If the JSON interface is stable, I don't see why it can't be used to build a tool such as pipdeptree. It will surely reduce the maintenance burden. Will explore this further.
> In particular, if you don't have the time to support pipdeptree, who would support that code if it were moved into pip? The pip maintainers, while maybe not as lacking in time as you are, are unlikely to have the bandwidth to give pipdeptree the attention it needs.
@pfmoore My thought behind this proposal is that if pipdeptree code is part of the pip repo itself, then any contributor who refactors pip internals could also take care of modifying the pipdeptree functions that call the refactored code. Example: suppose I, as a pip contributor, make a change to the pip._internal.operations.freeze.FrozenRequirement class, and if I grep for all usages of the class in the repo, I'll find pipdeptree and can modify it accordingly as part of the same effort. That way pipdeptree will always be compatible with pip code and doesn't need to be fixed reactively after a pip release. It'd be much better than maintaining it externally.
Of course, I could be wrong as I am really not familiar with how contribution to pip works. If adding pipdeptree code to pip means I as the original author will be expected to maintain it inside pip repo, then that's not what I'm proposing :-)
To be clear, given a chance I'd be happy to continue working on OSS, but I know it wouldn't work for me at present. No disrespect meant.
Thanks
on "pip inspect", having the option to avoid "Description" field may be nice, as some packages are a bit verbose
> on "pip inspect", having the option to avoid "Description" field may be nice, as some packages are a bit verbose
Yes it can be verbose. OTOH, this format is meant for tools, not humans, and unless we hit practical parsing performance or memory issues, it may not be worth the additional complexity.
> If adding pipdeptree code to pip means I as the original author will be expected to maintain it inside pip repo, then that's not what I'm proposing :-)
It doesn't, necessarily. I was mostly reiterating what @sbidoul said, that it's unlikely the pip devs would have the time to give this much support, and therefore if you're not able to either (which was what I assumed, but you hadn't specifically said that) then we'd end up just moving the unmaintained code into pip, for no real benefit.
from 'pip inspect' it's a bit tedious to replace " and extra == 'test'" or " extra == 'test'" per a qualified extra requirement_branch 'test'...
@stonebig the packaging library has everything to parse requires_dist correctly. Each item is a Requirement, then you can use requirement.marker to evaluate if it is applicable to the environment and extras.
ok, extra doesn't work as I hoped, because of 3 corner cases, all generated via poetry. Parsing requires_dist by myself was not a good idea.
isort * extra == "pipfile_deprecated_finder" or extra == "requirements_deprecated_finder"
hypercorn * (platform_system != "Windows") and (extra == "uvloop")
fastapi * and ( python_version<3.7) * extra == "test" and ( python_version<'3.7')
I'm just forced to pass an environment to Marker.evaluate(), as an 'extra' variable on its own doesn't seem to work
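For marker shapes like the isort and hypercorn cases above, a minimal sketch with the packaging library could look like the following. The helper name `applies` and the requirement names used in the usage note are illustrative assumptions; only `Requirement`, its `marker` attribute and `Marker.evaluate()` come from packaging.

```python
from packaging.requirements import Requirement


def applies(requirement_string, extra="", overrides=None):
    """Return True if the requirement's marker holds for the given extra.

    The running interpreter supplies the default marker environment;
    `extra` and any `overrides` are layered on top of it.
    """
    req = Requirement(requirement_string)
    if req.marker is None:
        return True  # unconditional requirement
    environment = {"extra": extra}
    environment.update(overrides or {})
    return req.marker.evaluate(environment)
```

A call such as `applies('uvloop; (platform_system != "Windows") and (extra == "uvloop")', extra="uvloop")` then answers whether that line belongs to the "uvloop" branch on the current platform, without any string surgery on the marker text.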
@sbidoul now relying on 'pip inspect'. I hope you won't change the format too much, only adding things.
> @sbidoul now relying on 'pip inspect'. I hope you won't change the format too much, only adding things.
Cool. I've no intent to change it unless we receive user feedback that requires it.
the only problem of this 'via terminal' interface is that it may not translate to pyodide type of Distros. ... dreaming of a json "direct" output of the same thing, one day
(Sorry, this became more of an essay than I intended...)
I really do think this sort of "query installed package" type of utility would be far better handled by someone taking the plunge and writing a standalone library for it. Yes, it won't be easy to capture all the subtleties pip covers - there's a lot of history in pip's codebase. But that doesn't mean it's not achievable, and if everyone keeps doing nothing in the hope that one day, pip's maintainers will have enough free time to implement a supported importable API, then we're never going to get anywhere. (Hint - that's not likely to happen in the foreseeable future).
Start small. From a quick look at pipdeptree, the only pip functionality it uses consists of:
- Listing all the installed distributions. For most environments, this can be done via importlib.metadata. Add support for anything else when people ask for it (if there actually is anything importlib.metadata fails to support...).
- Providing a string representation of a distribution. As far as I can see, this is relatively straightforward. Support for PEP 610/660 would need a small amount of work, although handling legacy editable installs (that don't follow PEP 660) would be trickier, because they were never standardised. Feel free to check the pip sources for tricks on how to do this, though.
Once there's a library that does this much (which would support the pipdeptree use case), adding further functionality could be done on an "as required" basis.
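The first of those two pieces can be sketched with the standard library alone. This is only an illustration under the assumption that every distribution exposes usable metadata. The names `installed_requirements` and `normalize` are hypothetical, and the requirement strings are returned unparsed.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_requirements(dists=None):
    """Map each installed distribution to its version and raw requirements.

    `dists` defaults to everything importlib.metadata can discover on
    sys.path; passing an explicit iterable makes the function easy to test.
    """
    if dists is None:
        dists = metadata.distributions()
    table = {}
    for dist in dists:
        name = dist.metadata["Name"]
        if not name:
            continue  # skip entries whose metadata lacks a name
        table[normalize(name)] = {
            "version": dist.version,
            # dist.requires is None when no Requires-Dist fields are present
            "requires": list(dist.requires or []),
        }
    return table
```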
I could probably knock up a (standardised cases only) prototype pretty quickly. But that's not the hard part - someone needs to take ownership of the library and be prepared to support it. And that's where all of this falls down, because it always comes back to "pip should include this" - which basically means "I want the pip maintainers to support it so I don't have to". (To be clear, someone saying "I don't want to be the one to build and support this" is perfectly acceptable. It's when people feel it's OK to expect others to do so, for free, that it starts to get problematic.)
And in case it's not clear enough yet, pip doesn't "include code to do this already". It includes code that does various bits of this, and it puts them together in a context that has a whole bunch of assumptions that a public API can't make, and produces results in a way that probably isn't appropriate for a public API. But making an API out of these bits is likely to be at least as much work as writing it from scratch.
> although handling legacy editable installs (that don't follow PEP 660) would be trickier, because they were never standardised.
They're still visible as regular Python packages, with dist-info or egg-info, last I checked, so it should "just work".
Is there anything actionable here, on pip's end?
> They're still visible as regular Python packages, with dist-info or egg-info, last I checked, so it should "just work".
In this context, I think the problem is that any new library would need to duplicate pip's logic for reading the "old-style" files to get the location of the project (as there's no direct_url.json file). But as I said, that's just a bit of extra work.
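The standardised side of this is small. PEP 610 records the origin of a direct install in a file named direct_url.json inside the .dist-info directory, and importlib.metadata can read it. The sketch below covers only that case. Legacy editable installs have no such file and are not handled, and the output format of `describe` is an assumption for illustration, not pip's or pipdeptree's.

```python
import json
from importlib import metadata


def direct_url_info(dist):
    """Return the parsed PEP 610 direct_url.json of a distribution, or None."""
    text = dist.read_text("direct_url.json")
    if text is None:
        return None  # regular index install, or a legacy editable install
    return json.loads(text)


def describe(dist):
    """Render a one-line string representation of a distribution."""
    name = dist.metadata["Name"]
    info = direct_url_info(dist)
    if info is None:
        return f"{name}=={dist.version}"
    if info.get("dir_info", {}).get("editable"):
        return f"{name}=={dist.version} (editable: {info['url']})"
    return f"{name} @ {info['url']}"


if __name__ == "__main__":
    for dist in metadata.distributions():
        print(describe(dist))
```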
> Is there anything actionable here, on pip's end?
Nope.