pip
Add support for outputting a list of extras and their requirements.
- Pip version: 9.0.1
- Python version: 3.6
- Operating system: Ubuntu 17.10
Description:
I'm trying to write a script to automatically generate constraints files (with hashes), and I'd like a way for this script to detect any extras and generate the constraints file for those too.
I have no preference as to the command, but it might be nice to put it in pip show as another section (depending on compatibility).
What I've run:
$ pip show requests
Name: requests
Version: 2.18.4
...
Requires: idna, urllib3, chardet, certifi
ExtraRequires[security]: cryptography, pyOpenSSL
ExtraRequires[socks]: PySocks
An alternative would be to create a pip metadata <package> command to output the <...>.dist-info/METADATA or <...>.dist-info/metadata.json files so that they can be parsed by downstream tooling.
This seems to be deeply related and a better alternative to #3797.
An even better alternative would be to use importlib_metadata, which has an API.
>>> import importlib_metadata
>>> importlib_metadata.metadata('xonsh').get_all('Provides-Extra')
['linux', 'mac', 'proctitle', 'ptk', 'pygments', 'win']
>>> importlib_metadata.metadata('xonsh').get_all('Requires-Dist')
["distro; extra == 'linux'", "gnureadline; extra == 'mac'", "setproctitle; extra == 'proctitle'", "prompt-toolkit; extra == 'ptk'", "pygments (>=2.2); extra == 'pygments'", "win-unicode-console; extra == 'win'"]
And use packaging to parse them:
>>> import packaging.requirements
>>> req = next(map(packaging.requirements.Requirement, importlib_metadata.metadata('xonsh').get_all('Requires-Dist')))
>>> req.name
'distro'
>>> req.specifier
<SpecifierSet('')>
>>> req.extras
set()
>>> req.marker
<Marker('extra == "linux"')>
The lack of this feature is an increasingly troublesome annoyance for Twisted. Since python packaging is allegedly Good Now, Actually™ (thanks for that, by the way!), we're starting to rely more on asking our users to install e.g. twisted[tls] or twisted[all_non_platform] in order to get specific optional features, rather than testing huge and ancient dependency matrices so everything will work correctly with default packages on whatever unfortunate end-of-lifed Linux distribution our users happen to be installing on.
However, how is the user supposed to know if it's twisted[tls] or twisted[ssl]? Where do they find out about the string all_non_platform? It would be really great if we could not only get a list of extras, but also like, include some documentation with each one so that as someone is pip install'ing, it's clear to them what extras are available and what each one means.
Thanks for reading!
Given that, as @jaraco mentioned, the data is available via importlib_metadata and packaging, maybe someone could write and publish a standalone script to extract this data?
I'm sympathetic to the idea that making the information available via pip aids discoverability, but given that it's possible to get the information the OP needed already, it seems less than ideal to block any progress on this until it can be added to pip. (If nothing else, a standalone script could be the basis of a future PR for pip).
The OP posted the issue a year and a half ago. I hope they already found a way to not be blocked in their endeavours. :) This is the pip issue tracker though. It seems appropriate to focus on how to support this use-case in pip here.
The lack of this feature is an increasingly troublesome annoyance for Twisted. Since python packaging is allegedly Good Now, Actually™ (thanks for that, by the way!), we're starting to rely more on asking our users to install e.g. twisted[tls] or twisted[all_non_platform] in order to get specific optional features, rather than testing huge and ancient dependency matrices so everything will work correctly with default packages on whatever unfortunate end-of-lifed Linux distribution our users happen to be installing on.
However, how is the user supposed to know if it's twisted[tls] or twisted[ssl]? Where do they find out about the string all_non_platform? It would be really great if we could not only get a list of extras, but also like, include some documentation with each one so that as someone is pip install'ing, it's clear to them what extras are available and what each one means.
Thanks for reading!
One thing to note is that the OP seems to have been discussing gathering this information for programmatic consumption ("I'm trying to write a script to automatically ...") but this comment from @glyph is firmly focused on user education and so probably manual usage by a human. These are both good and interesting use-cases but they may not have the same solution.
While it might be a slightly different enhancement than what the OP meant, I think we would all be glad to have a pip search-extras <package> command that outputs the available extras for an (uninstalled) package.
pip is a tool that allows us to list (search) packages and then install them. It allows us to install extras, but how can we discover them?
I believe it's more similar to what @glyph wanted?
Using the
import importlib_metadata; importlib_metadata.metadata('twisted').get_all('Provides-Extra')
requires a venv with importlib_metadata installed, and running python code - not an easy interaction when you are just trying to remember a name of an extra to install.
Okay, this feature makes sense and we'd want to include it in pip. We welcome folks to submit a PR for implementing this functionality. Note that this does not mean that the said PR would be accepted - it would still be subject to our regular code reviews as with every other PR in pip.
The relevant bit of code is likely in src/pip/_internal/commands/show.py. Currently, pip uses pkg_resources and the main task is to figure out how to correctly compute the extras a package supports and getting the dependencies that are specified in that extra.
There are actually two features this thread talks about.
The top post describes showing extras in pip show. The functionality is simple to implement (as demonstrated multiple times here), but the interface would require some thought. Personally I would prefer pip to ditch the current pip show format and introduce something that shows the actual metadata (instead of the transformed version it currently outputs), which would greatly simplify things. Maybe a process like the earlier pip list migration would be applicable here. If pip decides to stick with the current show format, coming up with a good format to display extras would be another problem.
But that does not really help the discovery problem mentioned by @glyph, since show only works for installed packages. Currently there is essentially no way for packages to advertise their extras (except documenting them manually). The first solution that comes to mind is to show them on PyPI, like Crates.io does (at the bottom of the right-hand column). The problem, though, is that Python package requirements are declared per artifact, not per version, so theoretically a package can declare different extras for its Windows and Linux wheels of the same version. This is a very weird thing to do in almost all cases, but it is still possible, and PyPI needs to accommodate that, making the feature difficult to design.
A more workable solution would be to let pip show work with non-installed wheels. The simplest addition would be to allow pip show [path-to-wheel] so people can combine it with pip download to achieve what they want. Another possible step would be for pip to implement something like pip show --download [name], but that’s probably worth its own discussion.
I hope this thought dump is at least marginally helpful for anyone interested in implementing this feature.
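To illustrate the pip download route, here is a rough sketch that reads the METADATA file straight out of a wheel archive using only the standard library. It is not how pip implements anything; extras_from_wheel is a hypothetical helper name, and the code assumes the archive contains a single top-level .dist-info directory.

```python
import zipfile
from email.parser import Parser


def extras_from_wheel(wheel):
    """Return (extras, requirements) declared in a wheel's METADATA.

    `wheel` is a path or a binary file object. A wheel is a zip archive
    that carries a `<name>-<version>.dist-info/METADATA` file.
    """
    with zipfile.ZipFile(wheel) as archive:
        metadata_paths = [
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA") and name.count("/") == 1
        ]
        if not metadata_paths:
            raise ValueError("no .dist-info/METADATA found in archive")
        raw = archive.read(metadata_paths[0]).decode("utf-8")
    # Core metadata uses RFC 822 style headers, so the email parser can read it.
    message = Parser().parsestr(raw)
    extras = message.get_all("Provides-Extra") or []
    requirements = message.get_all("Requires-Dist") or []
    return extras, requirements
```

Combined with something like pip download --no-deps <name>, the downloaded .whl file could be passed to this function. Note that the answer describes that one artifact only, which matches the per-artifact caveat above.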
Personally I would prefer pip to ditch the current pip show format and introduce something that shows the actual metadata (instead of a transformed version it currently outputs), which would greatly simplify things.
The main suggestion in https://github.com/pypa/pip/pull/8008#pullrequestreview-391133428, which is the documentation PR for https://github.com/pypa/pip/pull/7967 (adding JSON format display to pip show), is to add a metadata key to the JSON output, holding a dictionary of the fields in https://packaging.python.org/specifications/core-metadata/. That should take care of listing extras and their requirements, at least for installed packages.
+1 for listing the extras of a package
We want to split our package into smaller packages linked as extras, and it would be nice to see all the available extras.
Also, I opened a Stack Overflow question about this before finding this thread.
For anyone here looking for something to copy/paste:
from collections import defaultdict
from importlib.metadata import metadata
from pprint import pprint


def get_extras(package_name):
    """Map each extra of an installed package to the requirements it adds."""
    meta = metadata(package_name)
    # get_all() returns None when a field is absent, so fall back to [].
    extras = meta.get_all("Provides-Extra") or []
    required_dists = meta.get_all("Requires-Dist") or []
    result = defaultdict(list)
    for extra in extras:
        # Assumes the " ; extra == '<name>'" layout; other layouts are skipped.
        suffix = f" ; extra == '{extra}'"
        for required_dist in required_dists:
            if required_dist.endswith(suffix):
                result[extra].append(required_dist.replace(suffix, ""))
    return result


pprint(get_extras("package_name_goes_here"))
This will output the extras, and the packages that each extra will install. E.g. for the transformers package:
{
"accelerate": ["accelerate (>=0.10.0)"],
"audio": ["librosa", "pyctcdecode (>=0.3.0)", "phonemizer"],
"codecarbon": ["codecarbon (==1.2.0)"],
"deepspeed": ["deepspeed (>=0.6.5)", "accelerate (>=0.10.0)"],
...
}
I have no idea how robust this is vis-a-vis the layout of the METADATA file, but it Works on My Machine™
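On robustness: the suffix match only recognizes the exact layout " ; extra == 'name'", while the xonsh output earlier in the thread shows entries such as "distro; extra == 'linux'" with no space before the semicolon. Below is a minimal sketch of a more tolerant grouping. It assumes a regular expression is good enough for simple markers; the packaging library's Requirement parser mentioned earlier is the more thorough choice. group_by_extra is a hypothetical helper name, and any other marker conditions on the same line are dropped.

```python
import re
from collections import defaultdict

# Matches an `extra == "name"` clause with either quote style.
_EXTRA_CLAUSE = re.compile(r"""extra\s*==\s*(['"])(?P<extra>[^'"]+)\1""")


def group_by_extra(requires_dist):
    """Group Requires-Dist strings by the extra named in their marker.

    Entries without an `extra == ...` clause are unconditional
    requirements and are left out of the result.
    """
    grouped = defaultdict(list)
    for entry in requires_dist or []:
        # Everything after the first ";" is treated as the marker.
        requirement, _, marker = entry.partition(";")
        match = _EXTRA_CLAUSE.search(marker)
        if match:
            grouped[match.group("extra")].append(requirement.strip())
    return dict(grouped)


sample = [
    "distro; extra == 'linux'",
    "pygments (>=2.2); extra == 'pygments'",
    'accelerate (>=0.10.0) ; extra == "deepspeed"',
    "idna",
]
print(group_by_extra(sample))
```

To use it on an installed package, the list from metadata(name).get_all("Requires-Dist") can be passed in directly; a None result is treated as an empty list.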
@davidgilbertson You can use pip inspect (https://github.com/pypa/pip/pull/11245) with jq
Something like… pip inspect | jq '.installed[]|select(.metadata.name=="Twisted").metadata.provides_extra' ?
'pip inspect' is OK, and gives PyPA devs full liberty to change or improve anything. An option like 'from pip import inspect' that returns the same full JSON string would be even better, if it could remove the underlying command-line legacy.
It would be ideal if an implementation for #7122 had an error-message similar to: There is no extra "foo". This package has extras: "bar", "quux" and "zinga".
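To make the suggested wording concrete, here is a small sketch of how such a message could be assembled. It is not taken from pip's code base; unknown_extra_message is a hypothetical helper, and the sentence used for a package with no extras is an assumption.

```python
def unknown_extra_message(requested, available):
    """Build an error message for an extra the package does not declare."""
    quoted = [f'"{name}"' for name in sorted(available)]
    if not quoted:
        return f'There is no extra "{requested}". This package has no extras.'
    if len(quoted) == 1:
        listing = quoted[0]
    else:
        # Join as: "a", "b" and "c"
        listing = ", ".join(quoted[:-1]) + " and " + quoted[-1]
    return f'There is no extra "{requested}". This package has extras: {listing}.'


print(unknown_extra_message("foo", ["quux", "bar", "zinga"]))
```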
Here's a one-liner based on @stonebig's answer that uses jq as well:
$ python -m pip inspect | jq '.installed[] | select(.metadata.name == "scipy").metadata.provides_extra'
WARNING: pip inspect is currently an experimental command. The output format may change in a future release without prior warning.
[
"test",
"doc",
"dev"
]
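For those without jq, roughly the same lookup can be done in Python. This is a minimal sketch that assumes the report layout used by the jq filters in this thread (an installed list whose items carry metadata.name and metadata.provides_extra). extras_from_report and run_pip_inspect are hypothetical helper names. Unlike the jq == comparison, names are normalized in the PEP 503 style before comparing, so Twisted and twisted are treated as the same project.

```python
import json
import re
import subprocess
import sys


def normalize(name):
    """Normalize a project name in the PEP 503 style."""
    return re.sub(r"[-_.]+", "-", name).lower()


def extras_from_report(report, package):
    """Return the extras listed for `package` in a parsed `pip inspect` report."""
    wanted = normalize(package)
    for item in report.get("installed", []):
        metadata = item.get("metadata", {})
        if normalize(metadata.get("name", "")) == wanted:
            # The key is absent when the package declares no extras.
            return metadata.get("provides_extra", [])
    raise LookupError(f"{package} is not in the report")


def run_pip_inspect():
    """Run `pip inspect` for the current interpreter and parse its JSON output."""
    completed = subprocess.run(
        [sys.executable, "-m", "pip", "inspect"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)
```

A call such as extras_from_report(run_pip_inspect(), "scipy") would then mirror the one-liner. Since pip inspect is flagged as experimental, the keys used here may change.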
Something like… pip inspect | jq '.installed[]|select(.metadata.name=="Twisted").metadata.provides_extra' ?
Here's a one-liner based on @stonebig's answer that uses jq as well:
$ python -m pip inspect | jq '.installed[] | select(.metadata.name == "scipy").metadata.provides_extra'
Building on @Glyph's and @kratsg's answers, I came up with a jq query that pulls the dependencies for each extra out of requires_dist and shows them as well:
pip inspect | jq ".installed[] | select(.metadata.name == \"scipy\").metadata.requires_dist
| map(select(contains(\"; extra ==\")) | capture(\"(?<dep>.+); extra == (\\\"|')(?<extra>[^']+)(\\\"|')\" ) )
| group_by(.extra)
| map( { (.[0].extra) : map(.dep) } )
| add"
{
"dev": [
"mypy",
"typing_extensions",
"types-psutil",
"pycodestyle",
"ruff",
"cython-lint>=0.12.2",
"rich-click",
"click",
"doit>=0.36.0",
"pydevtool"
],
"doc": [
"sphinx!=4.1.0",
"pydata-sphinx-theme==0.9.0",
"sphinx-design>=0.2.0",
"matplotlib>2",
"numpydoc",
"jupytext",
"myst-nb",
"pooch"
],
"test": [
"pytest",
"pytest-cov",
"pytest-timeout",
"pytest-xdist",
"asv",
"mpmath",
"gmpy2",
"threadpoolctl",
"scikit-umfpack",
"pooch"
]
}
And if you want to put it in a script and are too lazy to add the necessary bits XD here you go:
#!/usr/bin/env sh
python -m pip inspect | jq ".installed[] | select(.metadata.name == \"$1\").metadata.requires_dist
| map(select(contains(\"; extra ==\")) | capture(\"(?<dep>.+); extra == (\\\"|')(?<extra>[^']+)(\\\"|')\" ) )
| group_by(.extra)
| map( { (.[0].extra) : map(.dep) } )
| add"
I have created a tiny Python gist to print out package extras' requirements: get_package_extras.py.
Do we still want a dedicated feature for this, now that pip inspect sort of covers this (and a lot more)?
@uranusjr Indeed, all information is available with pip inspect. But the problem is pip inspect returns a huge JSON structure, so if you want to have just a bit of information, like package extras and their requirements, you have to parse and filter what pip inspect returns. If there is a sufficiently great number of people who would find extracting a subset of pip inspect output and filtering it with some parameters useful, I believe implementing such a functionality will be of value.
You may close this old thread; the idea may come back when a fundamental change happens, like 'uv' integration.
Do we still want a dedicated feature for this, now that pip inspect sort of covers this (and a lot more)?
I do.
Parsing hundreds of lines of JSON is "real work". The most-promising solution (thanks @mierzejk !) is 25 lines of python, not exactly fodder for a bash alias.
I'm here on this thread again in late 2024 because I've searched for how to do this, and been led here from a (now useless) StackOverflow answer. When it's faster to look in the source and understand setup.py and/or pyproject.toml, I think this deserves a feature, personally.
I'd be way more willing to take a stab at a PR if a concrete goal was outlined here.
Like: what does the option or subcommand look like?
(For me personally, I'd really like to have something like what https://github.com/pypa/pip/issues/7122 asks for, that is, both an error and a helpful message when installing an unknown extra, but that thread also doesn't appear to have consensus. Since this has come up for me several times recently, I'd love to have some "pip developer / UX guidance" on what would be most acceptable here: a command asking for valid extras, or an error message like 7122 wants? Does anyone have a few more hints towards what pip developers or other users would find acceptable here?)
For the purposes of spurring discussion, what about:
pip list-extras example-package: this outputs any valid extras for example-package one line at a time (or nothing if there are no extras). That is, if both pip install example-package[foo] and pip install example-package[bar] are valid, it would output:
$ pip list-extras example-package
bar
foo
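Until something like this exists in pip, the proposed behaviour can be approximated for installed packages with the standard library. This is a sketch only: pip has no list-extras subcommand, list_extras and main are hypothetical names, and the sorted, one-per-line output simply mirrors the example above.

```python
import sys
from importlib.metadata import PackageNotFoundError, metadata


def list_extras(package):
    """Return the sorted extras declared by an installed package."""
    declared = metadata(package).get_all("Provides-Extra") or []
    return sorted(set(declared))


def main(argv):
    """Print one extra per line; print nothing if the package has none."""
    if len(argv) != 2:
        print(f"usage: {argv[0]} PACKAGE", file=sys.stderr)
        return 2
    try:
        extras = list_extras(argv[1])
    except PackageNotFoundError:
        print(f"{argv[1]} is not installed", file=sys.stderr)
        return 1
    for extra in extras:
        print(extra)
    return 0
```

Saved as a script, it would end with sys.exit(main(sys.argv)). It only covers packages installed in the current environment, so it does not address discovering extras before installation.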
This is 100% what I'm looking for.
FWIW this is equivalent:
$ pip inspect | jq -c '.installed[].metadata | select(.name | contains("apache-beam")) | .provides_extra'
["docs","test","gcp","interactive","interactive-test","ml-test","p312-ml-test","aws","azure","dataframe","dask","yaml","torch","tensorflow","transformers","tft","onnx","xgboost","tensorflow-hub"]