
Clarification on how to refer to platform-specific pypi packages

Open petergardfjall opened this issue 4 years ago • 2 comments

A Python package sometimes differs in content or dependencies depending on the targeted operating system, the version of the Python interpreter (2 vs 3, for instance), and which "extras" (optional features) of the package the consumer is interested in. This effectively results in different "editions" of the same package.

I assume that purl qualifiers would be used to represent these different editions, correct? For example:

pkg:pypi/[email protected]?extras=security,socks

pkg:pypi/[email protected]?python_version=2.7

and similar. The README doesn't elaborate on any special cases like these for pypi purls, whereas the description of maven purls, for example, contains quite a few examples with qualifiers.

So I'd like to know whether qualifiers are the natural construct to represent these Python package "variations". It would also be interesting to learn of any example use of pypi purls in the wild.
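To make the question concrete, here is a minimal sketch of how a consumer might pull the qualifiers out of such a purl. It uses only the standard library and is a simplification, not a full implementation of the purl spec (no namespace handling, for example). The function name `split_purl` and the `requests@2.25.1` coordinates are hypothetical, and `extras` is the qualifier key proposed in this issue, not one the spec defines for pypi.

```python
from urllib.parse import unquote


def split_purl(purl: str) -> dict:
    """Split a purl string into its parts (simplified sketch, no namespace)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl}")
    remainder = purl[len("pkg:"):]

    # Strip the optional subpath, then the optional qualifiers.
    remainder, _, _subpath = remainder.partition("#")
    remainder, _, query = remainder.partition("?")

    # The version, if present, follows the last '@'.
    version = None
    if "@" in remainder:
        remainder, _, raw_version = remainder.rpartition("@")
        version = unquote(raw_version)

    pkg_type, _, name = remainder.partition("/")

    # Qualifiers are '&'-separated key=value pairs; keys are lowercased.
    qualifiers = {}
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key and value:
            qualifiers[key.lower()] = unquote(value)

    return {
        "type": pkg_type.lower(),
        "name": unquote(name),
        "version": version,
        "qualifiers": qualifiers,
    }


# Hypothetical coordinates, using the 'extras' qualifier proposed above.
parts = split_purl("pkg:pypi/requests@2.25.1?extras=security,socks")
extras = parts["qualifiers"]["extras"].split(",")
```

With this shape, a tool could treat the comma-separated `extras` value as a list and decide which optional dependencies to include.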

Edit: this is not necessarily a request to update the documentation. Consider it a question.

petergardfjall, Aug 24 '20 06:08

This would be very helpful, as it affects how complete the SBOM for Python packages is. Currently, the list of dependencies needed to audit a package's supply chain is not complete without the extras, as those often imply additional dependencies to be installed. It would also be highly beneficial to include other specifiers from the package JSON metadata, such as:

  • filename
  • md5 (or maybe also sha256) digest
  • python_version
  • packagetype
  • requires_python
  • url

For auditing purposes, I feel it is important that there is at least one way to precisely select the package file that is (or would be) installed, such as the MD5 checksum or filename mentioned above. The pkg:pypi/requests purl could mean something different on macOS vs Windows or another OS, or on a different Python version.

The extras specifier is also helpful because, as mentioned, it changes which dependencies are installed (important for generating an SBOM and for auditing). As far as I am aware, it does not modify the location of the package, since the extras are extracted at installation time from the package itself (setup.py or wheel metadata). Nevertheless, it is still very helpful to have it included.

For reference, here is the package JSON metadata from PyPI: https://pypi.org/pypi/requests/json

The change itself looks quite simple, and it would greatly improve the usability of pypi purls. If there is interest, I could give it a shot and draft a pull request with those changes.

RootLUG, Jul 09 '21 12:07

I dug further into the purl spec, and it seems that the specification defines a few qualifiers that are valid for all package types even if they are not mentioned explicitly in the PyPI purl documentation, such as download_url, file_name, and checksum. Link to that section in the purl spec: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#known-qualifiers-keyvalue-pairs
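As a rough illustration of how those known qualifiers could pin down one specific file, here is a sketch that builds a purl from a single file entry shaped like the items in the `urls` list of the PyPI JSON metadata (`filename`, `digests`, `url`). The helper name `pypi_file_purl` and every value in `entry` are made-up placeholders, and leaving `:` and `/` unencoded in qualifier values is my reading of the examples in the spec, so treat that as an assumption.

```python
from urllib.parse import quote


def pypi_file_purl(name: str, version: str, file_entry: dict) -> str:
    """Build a purl for one distribution file using the known qualifiers."""
    qualifiers = {
        "file_name": file_entry["filename"],
        # The checksum qualifier takes 'algorithm:hex_digest' values.
        "checksum": "sha256:" + file_entry["digests"]["sha256"],
        "download_url": file_entry["url"],
    }
    # Qualifiers are emitted sorted by key; ':' and '/' are kept literal
    # (assumption based on the examples in the purl spec).
    query = "&".join(
        f"{key}={quote(value, safe=':/')}"
        for key, value in sorted(qualifiers.items())
    )
    # pypi names are lowercased and '_' is replaced with '-'.
    normalized = name.lower().replace("_", "-")
    return f"pkg:pypi/{normalized}@{quote(version)}?{query}"


# Placeholder entry: the digest and URL are NOT real values.
entry = {
    "filename": "requests-2.25.1-py2.py3-none-any.whl",
    "digests": {"sha256": "0" * 64},
    "url": "https://example.invalid/requests-2.25.1-py2.py3-none-any.whl",
}
purl = pypi_file_purl("Requests", "2.25.1", entry)
```

This covers the "which exact file" half of the problem; extras and interpreter version would still need qualifiers that the spec does not currently define for pypi.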

I guess this solves half of the issue

RootLUG, Jul 09 '21 12:07