Can not have license ID

Open bj1116 opened this issue 1 year ago • 5 comments

Command executed:

syft packages sonatype/nexus3 -o cyclonedx-json

But the output has no license ID, only a license name:

      "licenses": [
        {
          "license": {
            "name": "http://www.apache.org/licenses/LICENSE-2.0.txt"
          }
        }
      ],

Expected JSON:

      "licenses": [
        {
          "license": {
            "id": "Apache-2.0"
          }
        }
      ],
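
To see which components in a generated BOM are affected, something like the following could be used. It is only a sketch: the function name `licenses_without_id` is made up for illustration, and it assumes the BOM has already been loaded into a dict (for example with `json.load`).

```python
from typing import Iterator, Tuple


def licenses_without_id(bom: dict) -> Iterator[Tuple[str, str]]:
    """Yield (component name, license name) for license entries that have a name but no id.

    Hypothetical helper, not part of syft. Entries that use an "expression"
    instead of a "license" object are skipped.
    """
    for component in bom.get("components", []):
        for entry in component.get("licenses", []):
            lic = entry.get("license", {})
            if "id" not in lic and "name" in lic:
                yield component.get("name", ""), lic["name"]


# Small inline sample shaped like the output above.
sample_bom = {
    "components": [
        {
            "name": "example-a",
            "licenses": [{"license": {"name": "http://www.apache.org/licenses/LICENSE-2.0.txt"}}],
        },
        {
            "name": "example-b",
            "licenses": [{"license": {"id": "Apache-2.0"}}],
        },
    ]
}

for component_name, license_name in licenses_without_id(sample_bom):
    print(component_name, license_name)
```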

bj1116 avatar Jul 27 '23 11:07 bj1116

{
  "$schema": "http://cyclonedx.org/schema/bom-1.4.schema.json",
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "serialNumber": "urn:uuid:e1d1d486-6700-4643-8a40-ddd50cb1788a",
  "version": 1,
  "metadata": {
    "timestamp": "2023-07-27T19:08:01+08:00",
    "tools": [
      {
        "vendor": "anchore",
        "name": "syft",
        "version": "0.85.0"
      }
    ],
    "component": {
      "bom-ref": "106ffa560fa7d498",
      "type": "container",
      "name": "sonatype/nexus3",
      "version": "sha256:65878677e7195c74d34871eb7a2acfd5d9051890731675ca62f535e8e11dd404"
    }
  },
  "components": [
    {
      "bom-ref": "pkg:maven/org.hdrhistogram/HdrHistogram@2.1.12?package-id=81b53904decf82ec",
      "type": "library",
      "group": "org.hdrhistogram",
      "name": "HdrHistogram",
      "version": "2.1.12",
      "licenses": [
        {
          "license": {
            "name": "http://creativecommons.org/publicdomain/zero/1.0/, https://opensource.org/licenses/BSD-2-Clause"
          }
        }
      ],
      "cpe": "cpe:2.3:a:HdrHistogram:HdrHistogram:2.1.12:*:*:*:*:*:*:*",
      "purl": "pkg:maven/org.hdrhistogram/HdrHistogram@2.1.12",
      "externalReferences": [
        {
          "url": "",
          "hashes": [
            {
              "alg": "SHA-1",
              "content": "6eb7552156e0d517ae80cc2247be1427c8d90452"
            }
          ],
          "type": "build-meta"
        }
      ],
      "properties": [
        {
          "name": "syft:package:foundBy",
          "value": "java-cataloger"
        },
        {
          "name": "syft:package:language",
          "value": "java"
        },
        {
          "name": "syft:package:metadataType",
          "value": "JavaMetadata"
        },
        {
          "name": "syft:package:type",
          "value": "java-archive"
        },
        {
          "name": "syft:cpe23",
          "value": "cpe:2.3:a:hdrhistogram:HdrHistogram:2.1.12:*:*:*:*:*:*:*"
        },
        {
          "name": "syft:location:0:layerID",
          "value": "sha256:dfd807fb9360ed946984f55eec1c495463e647c0e486af4f2b8099311c8e56bb"
        },
        {
          "name": "syft:location:0:path",
          "value": "/opt/sonatype/nexus/system/org/hdrhistogram/HdrHistogram/2.1.12/HdrHistogram-2.1.12.jar"
        },
        {
          "name": "syft:metadata:-:artifactID",
          "value": "HdrHistogram"
        },
        {
          "name": "syft:metadata:-:groupID",
          "value": "org.hdrhistogram"
        },
        {
          "name": "syft:metadata:virtualPath",
          "value": "/opt/sonatype/nexus/system/org/hdrhistogram/HdrHistogram/2.1.12/HdrHistogram-2.1.12.jar"
        }
      ]
    },

bj1116 avatar Jul 27 '23 11:07 bj1116

Hi, I wanted to add a bit more context on this, hopefully providing some useful info.

We're also affected by this, as tools such as dependency-track rely on the license ID to keep track of license usage.

In Java there are multiple ways to express a JAR's license. So far I've been able to identify these cases:

  • Bundle-License: https://www.apache.org/licenses/LICENSE-2.0.txt in META-INF/MANIFEST.MF: usually this is a URL pointing to the license itself. This is the method syft currently uses to retrieve the license, and it produces a license name entry in the generated BOM. The license ID could potentially be inferred from this URL.
  • Include-Resource: META-INF/LICENSE.txt=LICENSE.txt, another attribute in META-INF/MANIFEST.MF: sometimes the license is an actual file, included in the archive and referenced by a relative path. In this case the license ID could be parsed from the license text. Note that the Include-Resource and Bundle-License attributes can both be present in the same MANIFEST file.
  • None of the above: I've found license files sitting in META-INF as-is, with no reference in the MANIFEST. Names vary; so far I've found LICENSE, LICENSE.txt and license.txt.
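
For the first case, inferring the ID from the URL could look roughly like this. This is a minimal sketch and not how syft does it: the lookup table, `_normalize` and `spdx_ids_from_bundle_license` are all names I made up, the table only covers the URLs that appear in this issue, and it assumes the attribute value is a plain comma-separated list of URLs.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Illustrative subset only (assumption): a real table would need far more entries.
_URL_TO_SPDX = {
    "apache.org/licenses/license-2.0": "Apache-2.0",
    "creativecommons.org/publicdomain/zero/1.0": "CC0-1.0",
    "opensource.org/licenses/bsd-2-clause": "BSD-2-Clause",
}


def _normalize(url: str) -> str:
    """Reduce a license URL to host + path, ignoring scheme, 'www.', case and common suffixes."""
    parsed = urlparse(url.strip().lower())
    host = parsed.netloc
    if host.startswith("www."):
        host = host[len("www."):]
    path = parsed.path.rstrip("/")
    for suffix in (".txt", ".html"):
        if path.endswith(suffix):
            path = path[: -len(suffix)]
    return host + path


def spdx_ids_from_bundle_license(value: str) -> List[Optional[str]]:
    """Map each comma-separated URL to an SPDX ID, or None when it is not in the table."""
    return [
        _URL_TO_SPDX.get(_normalize(part))
        for part in value.split(",")
        if part.strip()
    ]


print(spdx_ids_from_bundle_license("http://www.apache.org/licenses/LICENSE-2.0.txt"))
```

Since this is a pure table lookup on the URL text, it needs no network access. URLs that are not in the table simply come back as None, so the caller can decide what to do with them.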

Hope this helps.

rogueai avatar Aug 21 '23 16:08 rogueai

Thanks @rogueai for the context. Is there a value dependency-track will accept if an SBOM tool is unable to determine the license ID? For example, take https://www.foo.org/licenses/LICENSE-2.0.txt: if the URL is the only reference, but we're running in an air-gapped scenario and cannot derive the ID from the text of the URL, what would you expect the field to contain so that you are not getting an error?

spiffcs avatar Aug 21 '23 18:08 spiffcs

Hi @spiffcs, thanks for looking at this!

dependency-track's README just says:

Supports standardized SPDX license ID’s

So I suppose DT only recognizes license references given as SPDX IDs, specified as <id> (not <name>) in the BOM, as in this snippet:

<licenses>
  <license>
    <id>Apache-2.0</id>
  </license>
</licenses>

The tool has an internal list of all the possible license definitions, and uses their SPDX IDs to correlate components with licenses.

Just a clarification: we're not getting any "hard errors", but all our components report "license violations" as DT cannot correlate, say, "name: https://www.apache.org/licenses/LICENSE-2.0.txt" to "id: Apache-2.0".

IMHO, taking a more general view and ignoring dependency-track specifics: if a valid SPDX license ID cannot be retrieved, it's fine to just leave the value in the <name> field. Retrieving the ID should be "best effort", and the BOM should stay valid in the worst case.
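
In code, the "best-effort" behaviour I have in mind is roughly the following. It is only a sketch under my own assumptions: `cyclonedx_license_entry` is a made-up name, and `resolve` stands for whatever lookup ends up being used to turn a raw license string into an SPDX ID (here just a small dict for the example).

```python
from typing import Callable, Dict, Optional


def cyclonedx_license_entry(
    raw: str, resolve: Callable[[str], Optional[str]]
) -> Dict[str, Dict[str, str]]:
    """Build one entry for the BOM's "licenses" array.

    Uses "id" when the resolver finds an SPDX ID, and otherwise keeps the
    raw value under "name" so the BOM stays valid.
    """
    spdx_id = resolve(raw)
    if spdx_id:
        return {"license": {"id": spdx_id}}
    return {"license": {"name": raw}}


# Hypothetical resolver for the example: a plain dict lookup.
known = {"https://www.apache.org/licenses/LICENSE-2.0.txt": "Apache-2.0"}

print(cyclonedx_license_entry("https://www.apache.org/licenses/LICENSE-2.0.txt", known.get))
print(cyclonedx_license_entry("https://www.foo.org/licenses/LICENSE-2.0.txt", known.get))
```

The second call covers the air-gapped case asked about above: nothing is resolved, so the URL stays in the name field, exactly as it does today.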

rogueai avatar Aug 21 '23 19:08 rogueai

references #725

spiffcs avatar Feb 09 '24 18:02 spiffcs