
License scanner fails on Python packages without the License metadata key

Open pcastellazzi opened this issue 9 months ago • 1 comments

I ran osv-scanner scan --licenses=MIT . on a simple project to test it out. As the report below shows, well-known libraries are reported as "UNKNOWN":

╭──────────────────────────────────────────────────────────────┬───────────┬───────────────────────────┬─────────────────┬─────────╮
│ LICENSE VIOLATION                                            │ ECOSYSTEM │ PACKAGE                   │ VERSION         │ SOURCE  │
├──────────────────────────────────────────────────────────────┼───────────┼───────────────────────────┼─────────────────┼─────────┤
│ UNKNOWN                                                      │ PyPI      │ attrs                     │ 25.3.0          │ uv.lock │
│ non-standard                                                 │ PyPI      │ binaryornot               │ 0.4.4           │ uv.lock │
│ BSD-2-Clause                                                 │ PyPI      │ boolean-py                │ 4.0             │ uv.lock │
│ non-standard                                                 │ PyPI      │ chardet                   │ 5.2.0           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ click                     │ 8.1.8           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ colorama                  │ 0.4.6           │ uv.lock │
│ Apache-2.0                                                   │ PyPI      │ coverage                  │ 7.7.0           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ foss-flame                │ 0.21.1          │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ iniconfig                 │ 2.1.0           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ jinja2                    │ 3.1.6           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ jsonschema-specifications │ 2024.10.1       │ uv.lock │
│ Apache-2.0                                                   │ PyPI      │ license-expression        │ 30.4.1          │ uv.lock │
│ non-standard                                                 │ PyPI      │ markupsafe                │ 3.0.2           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ osadl-matrix              │ 2024.5.22.10535 │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ packaging                 │ 24.2            │ uv.lock │
│ BSD-3-Clause                                                 │ PyPI      │ psutil                    │ 7.0.0           │ uv.lock │
│ GPL-2.0-or-later                                             │ PyPI      │ python-debian             │ 1.0.1           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ referencing               │ 0.36.2          │ uv.lock │
│ Apache-2.0 AND CC-BY-SA-4.0 AND CC0-1.0 AND GPL-3.0-or-later │ PyPI      │ reuse                     │ 5.0.2           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ tomli                     │ 2.2.1           │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ typing-extensions         │ 4.12.2          │ uv.lock │
│ UNKNOWN                                                      │ PyPI      │ utools                    │ 0.1.0           │ uv.lock │
╰──────────────────────────────────────────────────────────────┴───────────┴───────────────────────────┴─────────────────┴─────────╯

I did a little digging into why this may be happening, and I think it is related to how osv-scanner reads licenses. My understanding is that osv-scanner reads package information from PyPI, which would explain why utools, my own package, is reported as unknown: it is not published.

Checking the output of https://pypi.org/pypi/attrs/json, I found that attrs does not use the field info.license but info.license_expression instead. According to https://packaging.python.org/en/latest/specifications/core-metadata/#license, the license expression should take priority when present.

The package click provides its license as a classifier, which is the oldest method. According to PyPA (again), when the project's license is already registered as a valid classifier, the classifier must be used, and the field info.license should be reserved for variations when needed.
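
A minimal sketch of the lookup order described above, assuming the input is the info object from a PyPI JSON response (the info.license_expression, info.license, and classifiers fields mentioned in this report). The function name resolve_license and the "UNKNOWN" fallback are hypothetical, and placing classifiers ahead of the free-text field follows the reading given in this issue, not any confirmed osv-scanner or deps.dev behavior.

```python
from typing import Any, Mapping

UNKNOWN = "UNKNOWN"  # assumed fallback label, mirroring the report above


def resolve_license(info: Mapping[str, Any]) -> str:
    """Pick a license string from the `info` object of a PyPI JSON response.

    Hypothetical helper: the order below is one reading of the packaging
    spec as described in this issue, not the scanner's actual logic.
    """
    # 1. Prefer the SPDX license expression when the package provides one.
    expression = (info.get("license_expression") or "").strip()
    if expression:
        return expression

    # 2. Fall back to "License :: ..." trove classifiers, keeping only
    #    the last segment of each classifier.
    names = [
        classifier.split(" :: ")[-1]
        for classifier in (info.get("classifiers") or [])
        if classifier.startswith("License :: ")
    ]
    if names:
        return ", ".join(names)

    # 3. Last resort: the free-text license field.
    text = (info.get("license") or "").strip()
    return text or UNKNOWN


# Illustrative, trimmed-down inputs (not real PyPI responses).
expression_only = {"license": None, "license_expression": "MIT", "classifiers": []}
classifier_only = {
    "license": None,
    "classifiers": ["License :: OSI Approved :: BSD License"],
}
print(resolve_license(expression_only))
print(resolve_license(classifier_only))
print(resolve_license({}))
```

With a fallback chain like this, a package is only reported as unknown when none of the three metadata sources is populated, which is also what would happen for an unpublished package with no metadata at all.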

pcastellazzi avatar Mar 21 '25 23:03 pcastellazzi

Hi @pcastellazzi, we depend on deps.dev for license data, and https://deps.dev/pypi/attrs indicates that attrs has an unknown license, so that's why we display UNKNOWN here.

Can you open a bug with deps.dev for this? I also found this related bug: https://github.com/google/deps.dev/issues/94

cuixq avatar Mar 23 '25 22:03 cuixq

This issue has not had any activity for 60 days and will be automatically closed in two weeks

See https://github.com/google/osv-scanner/blob/main/CONTRIBUTING.md for how to contribute a PR if you're interested in helping out.

github-actions[bot] avatar Jun 07 '25 01:06 github-actions[bot]

Automatically closing stale issue

github-actions[bot] avatar Jun 21 '25 02:06 github-actions[bot]