
Add support for outputting a list of extras and their requirements.

Open software-opal opened this issue 8 years ago • 26 comments

  • Pip version: 9.0.1
  • Python version: 3.6
  • Operating system: Ubuntu 17.10

Description:

I'm trying to write a script that automatically generates constraints files (with hashes), and I'd like a way for the script to detect any extras and generate constraints files for those too.

I have no preference as to the command, but it might be nice to put it in pip show as another section (depending on compatibility).

What I've run:

$ pip show requests
Name: requests
Version: 2.18.4
...
Requires: idna, urllib3, chardet, certifi
ExtraRequires[security]: cryptography, pyOpenSSL
ExtraRequires[socks]: PySocks

software-opal avatar Oct 30 '17 01:10 software-opal

An alternative would be to create a pip metadata <package> command to output the <...>.dist-info/METADATA or <...>.dist-info/metadata.json files so that they can be parsed by downstream tooling.
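
A minimal sketch of what downstream tooling could do with such output today, assuming Python 3.8+ and only the standard library. The helper names (`parse_metadata_text`, `read_installed_metadata`) and the sample package are made up for illustration; METADATA uses RFC 822 style headers, so the `email` parser can read it.

```python
from email.parser import Parser
from importlib.metadata import PackageNotFoundError, distribution


def parse_metadata_text(text):
    """Parse core-metadata text (RFC 822 style headers) into a message object."""
    return Parser().parsestr(text)


def read_installed_metadata(package_name):
    """Return the raw METADATA text of an installed distribution, or None."""
    try:
        dist = distribution(package_name)
    except PackageNotFoundError:
        return None
    # read_text() returns None when the file does not exist.
    return dist.read_text("METADATA")


# Example input resembling a METADATA file (hypothetical package).
SAMPLE = """\
Metadata-Version: 2.1
Name: example-package
Version: 1.0
Provides-Extra: security
Provides-Extra: socks
Requires-Dist: idna
Requires-Dist: cryptography; extra == 'security'
Requires-Dist: PySocks; extra == 'socks'
"""

message = parse_metadata_text(SAMPLE)
print(message.get_all("Provides-Extra"))
print(message.get_all("Requires-Dist"))
```

For an installed package, `parse_metadata_text(read_installed_metadata("requests"))` would give the same kind of object, provided the package is present in the current environment.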

software-opal avatar Oct 30 '17 01:10 software-opal

This seems to be deeply related and a better alternative to #3797.

pradyunsg avatar Nov 01 '17 19:11 pradyunsg

An even better alternative would be to use importlib_metadata, which has an API.

>>> import importlib_metadata
>>> importlib_metadata.metadata('xonsh').get_all('Provides-Extra')
['linux', 'mac', 'proctitle', 'ptk', 'pygments', 'win']
>>> importlib_metadata.metadata('xonsh').get_all('Requires-Dist')
["distro; extra == 'linux'", "gnureadline; extra == 'mac'", "setproctitle; extra == 'proctitle'", "prompt-toolkit; extra == 'ptk'", "pygments (>=2.2); extra == 'pygments'", "win-unicode-console; extra == 'win'"]

jaraco avatar Nov 01 '18 08:11 jaraco

And use packaging to parse them:

>>> import packaging.requirements
>>> req = next(map(packaging.requirements.Requirement, importlib_metadata.metadata('xonsh').get_all('Requires-Dist')))
>>> req.name
'distro'
>>> req.specifier
<SpecifierSet('')>
>>> req.extras
set()
>>> req.marker
<Marker('extra == "linux"')>

jaraco avatar Nov 01 '18 08:11 jaraco

The lack of this feature is an increasingly troublesome annoyance for Twisted. Since python packaging is allegedly Good Now, Actually™ (thanks for that, by the way!), we're starting to rely more on asking our users to install e.g. twisted[tls] or twisted[all_non_platform] in order to get specific optional features, rather than testing huge and ancient dependency matrices so everything will work correctly with default packages on whatever unfortunate end-of-lifed Linux distribution our users happen to be installing on.

However, how is the user supposed to know if it's twisted[tls] or twisted[ssl]? Where do they find out about the string all_non_platform? It would be really great if we could not only get a list of extras, but also like, include some documentation with each one so that as someone is pip install'ing, it's clear to them what extras are available and what each one means.

Thanks for reading!

glyph avatar Apr 23 '19 04:04 glyph

Given that, as @jaraco mentioned, the data is available via importlib_metadata and packaging, maybe someone could write and publish a standalone script to extract this data?

I'm sympathetic to the idea that making the information available via pip aids discoverability, but given that it's possible to get the information the OP needed already, it seems less than ideal to block any progress on this until it can be added to pip. (If nothing else, a standalone script could be the basis of a future PR for pip).

pfmoore avatar Apr 23 '19 08:04 pfmoore

> I'm sympathetic to the idea that making the information available via pip aids discoverability, but given that it's possible to get the information the OP needed already, it seems less than ideal to block any progress on this until it can be added to pip.

The OP posted the issue a year and a half ago. I hope they already found a way to not be blocked in their endeavours. :) This is the pip issue tracker though. It seems appropriate to focus on how to support this use-case in pip here.

exarkun avatar Apr 25 '19 15:04 exarkun

> The lack of this feature is an increasingly troublesome annoyance for Twisted. Since python packaging is allegedly Good Now, Actually™ (thanks for that, by the way!), we're starting to rely more on asking our users to install e.g. twisted[tls] or twisted[all_non_platform] in order to get specific optional features, rather than testing huge and ancient dependency matrices so everything will work correctly with default packages on whatever unfortunate end-of-lifed Linux distribution our users happen to be installing on.
>
> However, how is the user supposed to know if it's twisted[tls] or twisted[ssl]? Where do they find out about the string all_non_platform? It would be really great if we could not only get a list of extras, but also like, include some documentation with each one so that as someone is pip install'ing, it's clear to them what extras are available and what each one means.
>
> Thanks for reading!

One thing to note is that the OP seems to have been discussing gathering this information for programmatic consumption ("I'm trying to write a script to automatically ...") but this comment from @glyph is firmly focused on user education and so probably manual usage by a human. These are both good and interesting use-cases but they may not have the same solution.

exarkun avatar Apr 25 '19 15:04 exarkun

While it might be a slightly different enhancement than what the OP meant, I think we would all be glad to have a pip search-extras <package> that outputs the available extras for an (uninstalled) package. pip is a tool that lets us list (search) packages and then install them, and it lets us install extras, but how can we discover them? I believe this is closer to what @glyph wanted?

Using import importlib_metadata; importlib_metadata.metadata('twisted').get_all('Provides-Extra') requires a venv with importlib_metadata installed and running Python code, which is not an easy interaction when you are just trying to remember the name of an extra to install.

tsvikas avatar Dec 17 '19 11:12 tsvikas

Okay, this feature makes sense and we'd want to include it in pip. We welcome folks to submit a PR for implementing this functionality. Note that this does not mean that the said PR would be accepted - it would still be subject to our regular code reviews as with every other PR in pip.

The relevant bit of code is likely in src/pip/_internal/commands/show.py. Currently, pip uses pkg_resources, and the main task is to figure out how to correctly compute the extras a package supports and get the dependencies that are specified for each extra.

pradyunsg avatar Feb 06 '20 06:02 pradyunsg

There are actually two features this thread talks about.

The top post describes showing extras in pip show. The functionality is simple to implement (as demonstrated multiple times here), but the interface would require some thought. Personally I would prefer pip to ditch the current pip show format and introduce something that shows the actual metadata (instead of the transformed version it currently outputs), which would greatly simplify things. Maybe a process like the earlier pip list migration would be applicable here. If pip decides to stick with the current show format, coming up with a good format to display extras would be another problem.

But that does not really help the discovery problem mentioned by @glyph, since show only works for installed packages. Currently there is essentially no way for packages to advertise their extras (except documenting them manually). The first solution that comes to mind is to show them on PyPI, like Crates.io does (at the bottom of the right-hand column). The problem, though, is that Python package requirements are declared per artifact, not per version, so theoretically a package can declare different extras for its Windows and Linux wheels of the same version. This is a very weird thing to do in almost all cases, but it is still possible, and PyPI needs to accommodate it, which makes the feature difficult to design.

A more workable solution would be to let pip show work with non-installed wheels. The simplest addition would be to allow pip show [path-to-wheel] so people can combine it with pip download to achieve what they want. Another possible step would be for pip to implement something like pip show --download [name], but that’s probably worth its own discussion.
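
As a rough illustration of reading extras from a non-installed wheel, here is a sketch that opens a downloaded .whl as a zip archive and parses its METADATA file. It assumes only the standard library; the function names (`wheel_metadata`, `wheel_extras`) are hypothetical and this is not how pip itself is implemented.

```python
import zipfile
from email.parser import Parser


def wheel_metadata(wheel_path):
    """Return the parsed METADATA of a wheel without installing it."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel stores its metadata at <name>-<version>.dist-info/METADATA.
        candidates = [
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA") and name.count("/") == 1
        ]
        if not candidates:
            raise ValueError(f"no .dist-info/METADATA found in {wheel_path}")
        text = wheel.read(candidates[0]).decode("utf-8")
    return Parser().parsestr(text)


def wheel_extras(wheel_path):
    """List the extras declared by a wheel (empty list if none)."""
    return wheel_metadata(wheel_path).get_all("Provides-Extra") or []
```

Combined with pip download --no-deps <name>, passing the path of the downloaded .whl file to `wheel_extras` would list that artifact's extras. Because requirements are declared per artifact, a different wheel of the same version could in principle give a different answer.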

I hope this thought dump is at least marginally helpful for anyone interested in implementing this feature.

uranusjr avatar Feb 06 '20 17:02 uranusjr

> Personally I would prefer pip to ditch the current pip show format and introduce something that shows the actual metadata (instead of a transformed version it currently outputs), which would greatly simplify things.

The main suggestion in https://github.com/pypa/pip/pull/8008#pullrequestreview-391133428, the documentation PR for https://github.com/pypa/pip/pull/7967 (adding a JSON output format to pip show), is to add a metadata key to the JSON output holding a dictionary of the fields in https://packaging.python.org/specifications/core-metadata/. That should take care of listing extras and their requirements, at least for installed packages.

deveshks avatar May 05 '20 19:05 deveshks

+1 for listing the extras of a package

We want to split our package into smaller packages linked as extras and it would be nice to see what are all the available extras.

I also opened a Stack Overflow question about this before finding this thread.

fabiencelier avatar Aug 27 '20 09:08 fabiencelier

For anyone here looking for something to copy/paste:

import re
from collections import defaultdict
from importlib.metadata import metadata
from pprint import pprint


def get_extras(package_name):
    meta = metadata(package_name)
    # get_all() returns None when a field is absent, so fall back to [].
    extras = meta.get_all("Provides-Extra") or []
    required_dists = meta.get_all("Requires-Dist") or []

    result = defaultdict(list)
    for extra in extras:
        # Tolerate optional spaces around ";" and either quote style.
        # Only requirements whose whole marker is the extra clause are matched.
        suffix = re.compile(rf"""\s*;\s*extra\s*==\s*['"]{re.escape(extra)}['"]\s*$""")
        for required_dist in required_dists:
            if suffix.search(required_dist):
                result[extra].append(suffix.sub("", required_dist))

    return result


pprint(get_extras("package_name_goes_here"))

This will output the extras, and the packages that each extra will install. E.g. for the transformers package:

{
    "accelerate": ["accelerate (>=0.10.0)"],
    "audio": ["librosa", "pyctcdecode (>=0.3.0)", "phonemizer"],
    "codecarbon": ["codecarbon (==1.2.0)"],
    "deepspeed": ["deepspeed (>=0.6.5)", "accelerate (>=0.10.0)"],
	...
}

I have no idea how robust this is vis-a-vis the layout of the METADATA file, but it Works on My Machine™

davidgilbertson avatar Nov 01 '22 06:11 davidgilbertson

@davidgilbertson You can use pip inspect (https://github.com/pypa/pip/pull/11245) with jq

q0w avatar Nov 01 '22 07:11 q0w

Something like… pip inspect | jq '.installed[]|select(.metadata.name=="Twisted").metadata.provides_extra' ?

glyph avatar Nov 01 '22 08:11 glyph

'pip inspect' is OK, and gives pypa devs full liberty to change or improve anything. An option such as 'from pip import inspect' that returns the same full JSON string would be even better, if it could avoid the underlying command-line legacy.
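
Until something like that exists, one possible workaround is to call the command from Python and parse its output. The sketch below assumes a pip version that ships pip inspect and that the JSON report is written to stdout; both function names are hypothetical, and the name comparison is a simple case-insensitive match rather than full name normalization.

```python
import json
import subprocess
import sys


def pip_inspect_report():
    """Run `pip inspect` for the current interpreter and return the parsed JSON."""
    completed = subprocess.run(
        [sys.executable, "-m", "pip", "inspect"],
        check=True,
        capture_output=True,  # capture stdout and stderr separately; only stdout is parsed
        text=True,
    )
    return json.loads(completed.stdout)


def provided_extras(report, package_name):
    """Look up `provides_extra` for one package in a `pip inspect` report."""
    wanted = package_name.lower()
    for item in report.get("installed", []):
        metadata = item.get("metadata", {})
        if metadata.get("name", "").lower() == wanted:
            return metadata.get("provides_extra", [])
    return None  # package not found in the report
```

A call such as `provided_extras(pip_inspect_report(), "Twisted")` would then return the list of extras, an empty list if the package declares none, or None if it is not installed.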

stonebig avatar May 21 '23 12:05 stonebig

It would be ideal if an implementation for #7122 had an error-message similar to: There is no extra "foo". This package has extras: "bar", "quux" and "zinga".
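
To make the suggested wording concrete, here is a small sketch of how such a message could be assembled from a list of extra names. The function `unknown_extra_message` is hypothetical and not part of pip; it only illustrates the formatting.

```python
def unknown_extra_message(requested, available):
    """Build a message naming the extras a package actually provides."""
    if requested in available:
        return None  # nothing to report
    if not available:
        return f'There is no extra "{requested}". This package has no extras.'
    quoted = [f'"{name}"' for name in sorted(available)]
    if len(quoted) == 1:
        listing = quoted[0]
    else:
        # Join as: "a", "b" and "c"
        listing = ", ".join(quoted[:-1]) + " and " + quoted[-1]
    return f'There is no extra "{requested}". This package has extras: {listing}.'


print(unknown_extra_message("foo", ["quux", "bar", "zinga"]))
```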

meejah avatar Oct 04 '23 19:10 meejah

Here's a one-liner based on @stonebig's answer that uses jq as well:

$ python -m pip inspect | jq '.installed[] | select(.metadata.name == "scipy").metadata.provides_extra'
WARNING: pip inspect is currently an experimental command. The output format may change in a future release without prior warning.
[
  "test",
  "doc",
  "dev"
]

kratsg avatar Oct 20 '23 18:10 kratsg

> Something like… pip inspect | jq '.installed[]|select(.metadata.name=="Twisted").metadata.provides_extra' ?

> Here's a one-liner based on @stonebig's answer that uses jq as well:
>
> $ python -m pip inspect | jq '.installed[] | select(.metadata.name == "scipy").metadata.provides_extra'

Building on @glyph's and @kratsg's answers, I came up with a jq query that extracts the dependencies of each extra from requires_dist and outputs them as well:

pip inspect | jq ".installed[] | select(.metadata.name == \"aiogram\").metadata.requires_dist
  | map(select(contains(\"; extra ==\")) | capture(\"(?<dep>.+); extra == (\\\"|')(?<extra>[^']+)(\\\"|')\" ) )
  | group_by(.extra)
  | map( { (.[0].extra) : map(.dep) } )
  | add"
{
  "dev": [
    "mypy",
    "typing_extensions",
    "types-psutil",
    "pycodestyle",
    "ruff",
    "cython-lint>=0.12.2",
    "rich-click",
    "click",
    "doit>=0.36.0",
    "pydevtool"
  ],
  "doc": [
    "sphinx!=4.1.0",
    "pydata-sphinx-theme==0.9.0",
    "sphinx-design>=0.2.0",
    "matplotlib>2",
    "numpydoc",
    "jupytext",
    "myst-nb",
    "pooch"
  ],
  "test": [
    "pytest",
    "pytest-cov",
    "pytest-timeout",
    "pytest-xdist",
    "asv",
    "mpmath",
    "gmpy2",
    "threadpoolctl",
    "scikit-umfpack",
    "pooch"
  ]
}

And if you want to put it in a script and are too lazy to add the necessary bits XD here you go:

#!/usr/bin/env sh
python -m pip inspect | jq ".installed[] | select(.metadata.name == \"$1\").metadata.requires_dist
  | map(select(contains(\"; extra ==\")) | capture(\"(?<dep>.+); extra == (\\\"|')(?<extra>[^']+)(\\\"|')\" ) )
  | group_by(.extra)
  | map( { (.[0].extra) : map(.dep) } )
  | add"
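
For those without jq, roughly the same grouping can be sketched in Python over the requires_dist strings. This assumes the list has already been obtained (for example from the pip inspect JSON); `group_by_extra` and `EXTRA_CLAUSE` are names chosen for this example, and entries with more complex markers are skipped.

```python
import re
from collections import defaultdict

# Same idea as the jq capture: "<dep>; extra == '<name>'" with either quote style.
EXTRA_CLAUSE = re.compile(
    r"""^(?P<dep>.+?)\s*;\s*extra\s*==\s*['"](?P<extra>[^'"]+)['"]\s*$"""
)


def group_by_extra(requires_dist):
    """Map each extra name to the requirements that are gated on it."""
    grouped = defaultdict(list)
    for entry in requires_dist:
        match = EXTRA_CLAUSE.match(entry)
        if match:  # entries without a plain extra marker are skipped
            grouped[match["extra"]].append(match["dep"])
    return dict(grouped)
```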

Nachtalb avatar Nov 03 '23 17:11 Nachtalb

I have created a tiny Python gist to print out package extras' requirements: get_package_extras.py.

mierzejk avatar Apr 05 '24 08:04 mierzejk

Do we still want a dedicated feature for this, now that pip inspect sort of covers this (and a lot more)?

uranusjr avatar Apr 09 '24 07:04 uranusjr

@uranusjr Indeed, all the information is available with pip inspect. The problem is that pip inspect returns a huge JSON structure, so if you want just a bit of information, like package extras and their requirements, you have to parse and filter what pip inspect returns. If enough people would find it useful to extract a subset of the pip inspect output and filter it with some parameters, I believe implementing such functionality would be of value.

mierzejk avatar Apr 09 '24 09:04 mierzejk

You may close this old thread; the idea may come back when a fundamental change happens, such as 'uv' integration.

stonebig avatar Apr 09 '24 23:04 stonebig

> Do we still want a dedicated feature for this, now that pip inspect sort of covers this (and a lot more)?

I do.

Parsing hundreds of lines of JSON is "real work". The most-promising solution (thanks @mierzejk !) is 25 lines of python, not exactly fodder for a bash alias.

I'm here on this thread again in late 2024 because I searched for how to do this and was led here from a (now useless) StackOverflow answer. When it's faster to look in the source and understand setup.py and/or pyproject.toml, I think this deserves a feature, personally.

I'd be way more willing to take a stab at a PR if a concrete goal was outlined here.

Like: what does the option or subcommand look like?

(For me personally, I'd really like to have something like what https://github.com/pypa/pip/issues/7122 asks for -- both an error and a helpful message when installing an unknown extra -- but that thread also doesn't appear to have consensus. Since this has come up for me several times recently, I'd love to have some "pip developer / ux guidance" on what to do here and what might be most acceptable: a command asking for valid extras, or an error message like 7122 wants? Does anyone have a few more hints towards what pip developers or other users would find acceptable here?)

For the purposes of spurring discussion, what about:

  • pip list-extras example-package
  • this outputs any valid extras for example-package one line at a time (or nothing if there are no extras)
  • that is if both pip install example-package[foo] and pip install example-package[bar] are valid, it would output:
$ pip list-extras example-package
bar
foo
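
To make the proposed output concrete, here is a sketch of the same behaviour as a standalone helper rather than a pip subcommand. It assumes the package is already installed and relies only on importlib.metadata; `format_extras` and `list_extras` are hypothetical names, and pip list-extras itself does not exist.

```python
from importlib.metadata import PackageNotFoundError, metadata


def format_extras(extras):
    """Return the extras sorted, one per line (empty string if there are none)."""
    return "\n".join(sorted(extras or []))


def list_extras(package_name):
    """Return the formatted extras of an installed package, or None if not installed."""
    try:
        # get_all() returns None when the package declares no extras.
        extras = metadata(package_name).get_all("Provides-Extra")
    except PackageNotFoundError:
        return None
    return format_extras(extras)


print(format_extras(["foo", "bar"]))
```

Unlike the proposal, this only covers installed packages; supporting uninstalled ones would need the metadata to be fetched from an artifact first.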

meejah avatar Oct 02 '24 00:10 meejah

> For the purposes of spurring discussion, what about:
>
>   • pip list-extras example-package
>   • this outputs any valid extras for example-package one line at a time (or nothing if there are no extras)
>   • that is if both pip install example-package[foo] and pip install example-package[bar] are valid, it would output:
> $ pip list-extras example-package
> bar
> foo

This is 100% what I'm looking for

fabiencelier avatar Oct 02 '24 09:10 fabiencelier

> > For the purposes of spurring discussion, what about:
> >
> >   • pip list-extras example-package
> >   • this outputs any valid extras for example-package one line at a time (or nothing if there are no extras)
> >   • that is if both pip install example-package[foo] and pip install example-package[bar] are valid, it would output:
> > $ pip list-extras example-package
> > bar
> > foo
>
> This is 100% what I'm looking for

FWIW this is equivalent:

$ pip inspect | jq -c '.installed.[].metadata | select(.name | contains("apache-beam")) | .provides_extra'
["docs","test","gcp","interactive","interactive-test","ml-test","p312-ml-test","aws","azure","dataframe","dask","yaml","torch","tensorflow","transformers","tft","onnx","xgboost","tensorflow-hub"]

timblakely avatar Mar 06 '25 20:03 timblakely