No licenses included in scan with yarn.lock

Open geret1 opened this issue 3 years ago • 7 comments

What happened:

Generating an SBOM for a project results in an SPDX-JSON file in which every license is NONE.

What you expected to happen:

An SPDX-JSON file with concluded licenses other than NONE.

How to reproduce it (as minimally and precisely as possible):

As we discussed in Slack, you only need a dummy Node.js project with a yarn.lock and node_modules.

Anything else we need to know?:

Nope.

Environment:

  • Output of syft version: 0.38.0
  • OS (e.g: cat /etc/os-release or similar): OS X (tested locally)

geret1 avatar Feb 23 '22 19:02 geret1

Same here. I just scanned a couple of codebases, and Syft lists no licenses when using the plain JSON output format. The corpus includes package.json and package-lock.json files as well as requirements.txt files with many well-known packages (Angular, React, popular Python packages...).

martin-langhoff avatar Mar 08 '22 16:03 martin-langhoff

Seems related to discussion in #229

martin-langhoff avatar Mar 08 '22 17:03 martin-langhoff

Hi, just a note here — Syft is inspecting different files for package evidence depending on whether Syft is scanning a local directory or a container image. For directory scans, Syft looks for evidence of packages that are described but not necessarily installed, so files like package-lock.json, yarn.lock and requirements.txt. Also, Syft doesn't (yet) perform any remote queries to supplement its discovered package data — all analysis is static and local.

This means that if Syft scans a file that doesn't have license information, there's no way for Syft to surface license information.

Syft behaves differently when scanning container images, because its strategy shifts to finding evidence of software that has already been installed. In this case, Syft looks for files like package.json as proof of installation, and in these files Syft is able to detect license information.

One common enhancement request is for users to be able to control when Syft looks for what kinds of evidence. We're tracking that here: https://github.com/anchore/syft/issues/465

luhring avatar Mar 25 '22 13:03 luhring

But if we run Syft locally on our projects (e.g. a JavaScript project), it would be nice for it to look at node_modules to identify licenses in the supply chain. That is still static analysis, no? We could apply the same procedure in a CI/CD pipeline.

geret1 avatar Mar 25 '22 16:03 geret1
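
To illustrate the node_modules idea raised above, here is a minimal Python sketch that reads the declared license from each installed package's package.json. It is not Syft's implementation and does not use any Syft API; the function name `licenses_from_node_modules` is hypothetical, and the handling of the legacy object form of `license` is an assumption about what a scanner might want to tolerate.

```python
import json
from pathlib import Path


def licenses_from_node_modules(project_dir):
    """Map "name@version" to the license declared in each installed package.json.

    Hypothetical helper for illustration only; purely static and local.
    """
    found = {}
    node_modules = Path(project_dir) / "node_modules"
    for manifest in node_modules.rglob("package.json"):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skip it
        if not isinstance(data, dict):
            continue
        name, version = data.get("name"), data.get("version")
        if not name or not version:
            continue  # not a real package manifest (no name or version)
        license_field = data.get("license")
        # Assumption: tolerate the legacy form {"type": "MIT", "url": "..."}.
        if isinstance(license_field, dict):
            license_field = license_field.get("type")
        if not isinstance(license_field, str) or not license_field:
            license_field = "NONE"  # mirror the SBOM placeholder for "unknown"
        found[f"{name}@{version}"] = license_field
    return found
```

Calling `licenses_from_node_modules(".")` from a project root would return a plain dictionary, so the same function could run locally or inside a CI/CD job without any network access.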

Right, we can adjust Syft to enable that method of scanning — that's what we're talking about in #465 (see above).

luhring avatar Mar 28 '22 17:03 luhring

Any updates? It seems data catalogs are included and merged, but I'm not sure if that covers this issue.

geret1 avatar Jul 15 '22 06:07 geret1

Hi there, I just stumbled across this. Would it be possible to just take the license from package-lock.json? lockfileVersion 2 includes this field (yarn.lock and other lock files do not). I would also be interested in doing this as a first-time contribution, but I can't estimate whether it's a difficult issue. I also read that one day Syft will do dynamic analysis, so I'm not sure if it would be "worth it".

example:

    "node_modules/@angular-devkit/architect/node_modules/rxjs": {
      "version": "6.6.7",
      "resolved": "https://bahnhub.tech.rz.db.de:443/artifactory/api/npm/default-npm-3rdparty/rxjs/-/rxjs-6.6.7.tgz",
      "integrity": "sha1-kKwBisq/SRv2UEQjXVhjxNq4BMk=",
      "dev": true,
      "license": "Apache-2.0",
      "dependencies": {
        "tslib": "^1.9.0"
      },
      "engines": {
        "npm": ">=2.0.0"
      }
    },

henrysachs avatar Aug 05 '22 10:08 henrysachs
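
As a rough sketch of the package-lock.json suggestion in the comment above: with lockfileVersion 2, the lock file has a top-level `packages` map keyed by install path, and entries may carry a `license` field as in the rxjs example. The Python below is only an illustration of that idea, not how Syft does or will do it; the function name `licenses_from_package_lock` and the choice to derive the package name from the path are assumptions.

```python
import json


def licenses_from_package_lock(lock_text):
    """Extract name, version and license per entry of a lockfileVersion 2 lock file.

    Hypothetical helper for illustration only; `lock_text` is the file's contents.
    """
    lock = json.loads(lock_text)
    result = {}
    for path, entry in lock.get("packages", {}).items():
        if not path:
            continue  # the "" key describes the root project itself
        # Assumption: the package name is whatever follows the last "node_modules/".
        name = entry.get("name") or path.split("node_modules/")[-1]
        result[path] = {
            "name": name,
            "version": entry.get("version"),
            # The field is optional, so fall back to the SBOM placeholder.
            "license": entry.get("license", "NONE"),
        }
    return result
```

Keying the result by install path rather than by name keeps nested copies of the same package (different versions under different parents) from overwriting each other.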