component-detection
Python: Dependencies are looked up from incorrect URLs
When running component-detection on one of my projects I am seeing output like the following.
[INFO] Getting Python data from https://pypi.org/pypi/pywinpty>=1.1.0/json
[WARN] Received 404 Not Found from https://pypi.org/pypi/pywinpty>=1.1.0/json
[WARN] Dependency Package pywinpty>=1.1.0 not found in Pypi. Skipping package
It seems that component-detection sometimes requests a URL that still contains the version specifier, so the lookup fails with a 404. This seems to happen for transitive dependencies that are constrained to a specific version by some other direct or transitive dependency.
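To make the failure concrete, here is a minimal Python sketch of how the lookup URL seen in the log is formed. The helper name `pypi_json_url` is hypothetical and is not taken from component-detection; only the URL shape comes from the log output above.

```python
# URL shape taken from the log lines above.
PYPI_JSON_URL = "https://pypi.org/pypi/{project}/json"


def pypi_json_url(project: str) -> str:
    """Build the metadata URL for a project (hypothetical helper)."""
    return PYPI_JSON_URL.format(project=project)


# Passing the whole requirement string reproduces the URL from the log.
bad_url = pypi_json_url("pywinpty>=1.1.0")
# Passing only the project name gives the URL that was presumably intended.
good_url = pypi_json_url("pywinpty")
```

The detector presumably needs to strip everything after the project name before building the URL.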
This is handled by this regex:
https://github.com/microsoft/component-detection/blob/17a1fb2bd96f3757c4b032fbb7f161f16f049550/src/Microsoft.ComponentDetection.Detectors/pip/PipDependencySpecification.cs#L32-L35
Testing this regex on pywinpty>=1.1.0 works, but a requirement with optional dependencies (extras) such as PyJWT[crypto]<3,>=1.0.0 will match as PyJWT[crypto], which is incorrect. We probably need to add |(?=\[) to this regex and create a separate issue to track that we are not fetching the transitive dependencies of the optional dependency.
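The behaviour described above can be reproduced with a rough Python sketch. Both patterns below are simplified stand-ins written for illustration, assuming the real regex is a lazy match followed by lookaheads for version operators; the actual pattern in PipDependencySpecification.cs may differ. The function name `extract_name` is likewise hypothetical.

```python
import re

# Assumed shape of the current pattern: stop lazily before a version operator.
NAME_BEFORE_OPERATOR = re.compile(r"^.+?((?=<)|(?=>)|(?===)|(?=!=)|(?=~=))")

# Same pattern with the proposed extra alternative: also stop before "[".
NAME_BEFORE_OPERATOR_OR_EXTRA = re.compile(
    r"^.+?((?=<)|(?=>)|(?===)|(?=!=)|(?=~=)|(?=\[))"
)


def extract_name(pattern: re.Pattern, requirement: str) -> str:
    """Return the part of the requirement the pattern treats as the name."""
    match = pattern.match(requirement)
    # With no operator (and no extras) the whole string is the name.
    return match.group(0) if match else requirement


plain = extract_name(NAME_BEFORE_OPERATOR, "pywinpty>=1.1.0")
# -> "pywinpty"
with_extra_old = extract_name(NAME_BEFORE_OPERATOR, "PyJWT[crypto]<3,>=1.0.0")
# -> "PyJWT[crypto]" (the extras leak into the name)
with_extra_new = extract_name(NAME_BEFORE_OPERATOR_OR_EXTRA, "PyJWT[crypto]<3,>=1.0.0")
# -> "PyJWT"
```

Stopping at the bracket fixes the name, but the extras themselves are dropped, which is why the transitive dependencies of the optional dependency would still need separate handling.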
Additionally, there are some Requires-Dist entries in this format:
Requires-Dist: numpy<1.27.0,>=1.19.5
that do not match this regex: https://github.com/microsoft/component-detection/blob/17a1fb2bd96f3757c4b032fbb7f161f16f049550/src/Microsoft.ComponentDetection.Detectors/pip/PipDependencySpecification.cs#L13-L16
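As a point of comparison, here is a hedged Python sketch of a pattern that accepts both the parenthesised and the bare specifier forms of a Requires-Dist line. It is a simplification of the PEP 508 grammar written for this discussion, not the regex used in component-detection, and `parse_requires_dist` is a hypothetical name.

```python
import re

# Simplified pattern: name, optional [extras], optional specifier
# (with or without parentheses), optional environment marker after ";".
REQUIRES_DIST = re.compile(
    r"^Requires-Dist:\s*"
    r"(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"
    r"\s*(?:\[(?P<extras>[^\]]*)\])?"
    r"\s*\(?(?P<spec>[^;)]*)\)?"
    r"\s*(?:;.*)?$"
)


def parse_requires_dist(line: str):
    """Split a Requires-Dist line into (name, extras, specifier), or None."""
    match = REQUIRES_DIST.match(line.strip())
    if match is None:
        return None
    raw_extras = match.group("extras")
    extras = [e.strip() for e in raw_extras.split(",")] if raw_extras else []
    return match.group("name"), extras, match.group("spec").strip()


bare = parse_requires_dist("Requires-Dist: numpy<1.27.0,>=1.19.5")
# -> ("numpy", [], "<1.27.0,>=1.19.5")
extra = parse_requires_dist("Requires-Dist: PyJWT[crypto]<3,>=1.0.0")
# -> ("PyJWT", ["crypto"], "<3,>=1.0.0")
parens = parse_requires_dist("Requires-Dist: requests (>=2.0) ; extra == 'test'")
# -> ("requests", [], ">=2.0")
```

Capturing the name with an explicit character class, rather than "everything up to an operator", keeps extras and specifiers out of the name regardless of which form the metadata uses.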
This is also likely related to custom package index feeds; #1129 should have added support for this now.