feat: dpkg license improvement for non SPDX licenses

Open spiffcs opened this issue 1 year ago • 1 comments

What happened: Sometimes syft encounters a dpkg copyright file in which the regular expression used to match on the contents cannot identify the license.

In the example below, syft should find licenses such as:

NVIDIA Software License Agreement and CUDA Supplement to Software License Agreement

Reads the contents of the copyright file:

https://github.com/anchore/syft/blob/ca945d16e0949a41aa8786f55d21908242b224c8/syft/pkg/cataloger/debian/package.go#L252-L276

Sends the contents for parsing:

https://github.com/anchore/syft/blob/ca945d16e0949a41aa8786f55d21908242b224c8/syft/pkg/cataloger/debian/package.go#L101-L106

Searches for the license clause:

https://github.com/anchore/syft/blob/48f1e975f05183390d7c01718865f5f66e3f9012/syft/pkg/cataloger/debian/parse_copyright.go#L22-L41
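
To illustrate why a regex search over the copyright contents can come up empty, here is a minimal sketch. It is not syft's code: `licenseField` is a simplified stand-in pattern and `findLicenseFields` is a hypothetical helper name. It only shows that a matcher keyed on a machine-readable `License:` field finds nothing in a free-form copyright file like the NVIDIA one.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// licenseField is a simplified stand-in for a "License:" field matcher.
// Assumption: this is NOT the pattern syft actually uses.
var licenseField = regexp.MustCompile(`^License:\s*(\S+)`)

// findLicenseFields (hypothetical helper) returns the distinct values of
// "License:" fields found in the copyright contents, in order of appearance.
func findLicenseFields(contents string) []string {
	seen := map[string]bool{}
	var found []string
	scanner := bufio.NewScanner(strings.NewReader(contents))
	for scanner.Scan() {
		m := licenseField.FindStringSubmatch(scanner.Text())
		if m == nil || seen[m[1]] {
			continue // no field on this line, or value already recorded
		}
		seen[m[1]] = true
		found = append(found, m[1])
	}
	return found
}

func main() {
	// A machine-readable copyright file carries explicit License: fields.
	structured := "Files: *\nCopyright: 2024 Example\nLicense: GPL-2+\n"
	// A free-form copyright file only names the license in prose.
	freeForm := "NVIDIA Software License Agreement and CUDA Supplement to\nSoftware License Agreement\n"

	fmt.Println(findLicenseFields(structured)) // one license found
	fmt.Println(findLicenseFields(freeForm))   // nothing found
}
```

The second call returns no licenses even though the file clearly names one, which is the gap this issue describes.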

What you expected to happen: When a copyright file is found, SOME license information should be created for the package. Reporting no licenses is a bug.

Steps to reproduce the issue:

syft -o json nvidia/cuda:12.5.1-cudnn-runtime-ubuntu20.04 | grant list -o json | jq -r '.results[]
 | [.license.license_id, .license.name] | @csv' | sed 's/"//g'
  • Output of syft version: devel (tip of main)
  • OS (e.g: cat /etc/os-release or similar): OSX

spiffcs, Aug 01 '24 20:08

I've tracked down a couple of data sources syft could use to identify non-SPDX licenses. I'm currently looking at ways to incorporate them into license identification when generating the SBOM.

https://github.com/nexB/scancode-toolkit https://github.com/nexB/scancode-licensedb

spiffcs, Aug 08 '24 17:08

If you'd like a simplified solution to include custom licenses, you might want to take a look here: https://github.com/HeyeOpenSource/syft/tree/Custom_Licenses 😁

N.B.: I just ran make test on it without any failures.

HeyeOpenSource, Oct 22 '24 12:10

Reopening this, as #3412 and #3876 don't solve it for all cases. Now that both of those are in, we need a more precise change that addresses this for the dpkg cataloger.

spiffcs, May 13 '25 19:05