
RFC: SBOM generation

Open bdehamer opened this issue 1 year ago • 17 comments

Rendered RFC

bdehamer avatar Aug 07 '23 21:08 bdehamer

Long before generation, I think we'd need to ensure that the following things are addressed:

  • validation tools, that can confirm that a generated SBOM is valid
  • since there's 6 (or 7?) kinds of SBOMs, and npm packages can be used for multiples of them, how would this command handle generating those? production deps are different than dev deps, and the build environment is important too and may have nothing to do with npm
  • a validation tool that package authors can run to determine if end users will be able to generate a valid SBOM (and/or, help an author make needed changes to allow for a more complete SBOM to be created)

ljharb avatar Aug 08 '23 04:08 ljharb

since there's 6 (or 7?) kinds of SBOMs...

Given that one of the primary functions of the npm CLI is to manage a project's dependencies, I see this new command being useful primarily for the creation of Source- or Build-type SBOMs where it is important to capture the dependencies for an artifact.

Similar to the work done to capture package provenance, it's not hard to imagine the Build SBOM expanding in the future to also record information about the build process at the time of package publication. However, in the spirit of iterative improvement, generating a basic SBOM enumerating a project's dependencies seems like a good starting point.

production deps are different than dev deps

I think I addressed this in the RFC already (see the --omit flag and the format-specific annotations for dev dependencies) but lemme know if you think I should further clarify.

validation tools, that can confirm that a generated SBOM is valid

Not sure I understand this point -- the goal would be to have the CLI generate a valid SBOM in one of the two supported formats. I can imagine having a suite of integration tests to ensure that the SBOMs generated from the CLI are compatible with tools that consume SBOMs (things like osv, GH Dependency submission API, or snyk).

...determine if end users will be able to generate a valid SBOM

The CLI is already pretty good at detecting dependency issues. I would imagine that issues like missing or extraneous dependencies would be reported as errors when executing the command. Perhaps I should add some text to the RFC to make this explicit.

bdehamer avatar Aug 08 '23 21:08 bdehamer

Super excited to see this RFC - Thanks @bdehamer for proposing this!

validation tools, that can confirm that a generated SBOM is valid

Just for reference, we do have an online validation tool for SPDX files which may be helpful during development of this feature to check if it is producing valid SPDX. Since it is online, you wouldn't want to use it as part of the actual CI/CD or part of the NPM code itself.

We also have a command line implementation of SPDX validation in Java and one in Python. I know the Java command line utility is used in some CI/CD environments to validate the produced SPDX file.

Unfortunately, we don't have one (yet) in JavaScript. We are looking for volunteers to implement JavaScript validation if anyone is interested in contributing.

We also have a JSON Schema file you can use to validate the syntax.

goneall avatar Aug 09 '23 00:08 goneall

@bdehamer what i mean is, how can someone independently verify the validity of an SBOM? it's very easy to determine the validity of JSON or XML or anything with a schema - where's the open source validation tool that tells me that npm did the right thing?

ljharb avatar Aug 09 '23 21:08 ljharb

@bdehamer what i mean is, how can someone independently verify the validity of an SBOM?

@ljharb are you thinking of ways to validate that an SBOM can be reproduced given the same inputs later? One way could be to re-run npm ci given the original package.json and lockfile and verify that the packages and versions you get in node_modules are exactly what's in the SBOM.

My understanding is that npm doesn't always provide reproducible installs of node_modules. But it would probably be enough for verifiable SBOMs if we can ensure we always get the same packages and versions, not that every single installed file is 100% reproducible.
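
A rough sketch of that package-and-version check, done against the lockfile rather than a fresh node_modules. It assumes a lockfileVersion 2 or 3 package-lock.json (a `packages` map keyed by install path) and a CycloneDX JSON SBOM with a flat `components` list; the function names are hypothetical and this is not how the npm CLI itself does anything.

```python
from __future__ import annotations


def lockfile_packages(lock: dict) -> set[tuple[str, str]]:
    """Collect (name, version) pairs from a lockfileVersion 2/3 lockfile."""
    found = set()
    for path, entry in lock.get("packages", {}).items():
        version = entry.get("version")
        # Skip the root project (empty path), symlinked workspaces,
        # and any entry that carries no version.
        if path == "" or entry.get("link") or version is None:
            continue
        # The name is whatever follows the last "node_modules/" segment,
        # unless the entry names itself explicitly.
        name = entry.get("name") or path.rpartition("node_modules/")[2]
        found.add((name, version))
    return found


def sbom_packages(sbom: dict) -> set[tuple[str, str]]:
    """Collect (name, version) pairs from a CycloneDX JSON document."""
    found = set()
    for component in sbom.get("components", []):
        name = component["name"]
        # Assumption: a scoped package may be split into group + name.
        if component.get("group"):
            name = f'{component["group"]}/{name}'
        found.add((name, component["version"]))
    return found


def compare(lock: dict, sbom: dict) -> dict[str, list[tuple[str, str]]]:
    """Report packages present on one side but not the other."""
    locked, listed = lockfile_packages(lock), sbom_packages(sbom)
    return {
        "missing_from_sbom": sorted(locked - listed),
        "unexpected_in_sbom": sorted(listed - locked),
    }
```

Two empty lists would mean the SBOM and the lockfile agree on the set of packages and versions, which is the weaker guarantee described above: nothing here checks file contents or integrity hashes.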

feelepxyz avatar Aug 10 '23 08:08 feelepxyz

While that’s an important thing to have, i just mean, how do we know npm has implemented things correctly?

Is there anything package authors will be asked to “fix” to help produce better SBOMs? If so, how can package authors find this out in advance, before a slew of issues is filed?

ljharb avatar Aug 10 '23 09:08 ljharb

re https://github.com/npm/rfcs/pull/714#issuecomment-1672859758

Is there anything package authors will be asked to “fix” to help produce better SBOMs? If so, how can package authors find this out in advance, before a slew of issues is filed?

the same way it happens with npm-ls :-)

jkowalleck avatar Aug 10 '23 09:08 jkowalleck

@bdehamer this RFC is missing the fact that dependencies are not always deduplicated. Maybe this should be another RFC, after ratification of this one.

example deps:

my-application
    ├── [email protected]
    ├── [email protected]
    │       ├── [email protected]
    │       └── [email protected]
    └── [email protected]
            ├── [email protected]
            └── [email protected]

how should [email protected] be handled? It exists multiple times in the project, with different module resolution graphs. Technically, even though the code might be the same, both [email protected] are NOT the same, since they utilize different versions of ansi-regex.

see also:

  • https://github.com/CycloneDX/cyclonedx-node-npm/blob/main/docs/component_deduplication.md
  • https://github.com/CycloneDX/cyclonedx-node-npm/blob/main/docs/result.md
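
To make the question above concrete, here is a small sketch of one possible way to tell such instances apart: derive each component's identity from its own name and version plus the identities of the dependencies it actually resolves to. The tree shape, the function names, and the package names and versions used to exercise it are all hypothetical; this is not the approach taken by the RFC or by cyclonedx-node-npm.

```python
from collections import defaultdict


def identity(node: dict) -> str:
    """Identity of one installed instance: name@version plus resolved deps."""
    base = f'{node["name"]}@{node["version"]}'
    child_ids = sorted(identity(child) for child in node.get("dependencies", []))
    return f'{base}({",".join(child_ids)})' if child_ids else base


def group_instances(root: dict) -> dict[str, set[str]]:
    """Map each name@version to the distinct identities found in the tree."""
    groups: dict[str, set[str]] = defaultdict(set)
    stack = list(root.get("dependencies", []))
    while stack:
        node = stack.pop()
        groups[f'{node["name"]}@{node["version"]}'].add(identity(node))
        stack.extend(node.get("dependencies", []))
    return groups


def ambiguous(root: dict) -> list[str]:
    """name@version keys that resolve to more than one distinct identity."""
    return sorted(key for key, ids in group_instances(root).items() if len(ids) > 1)
```

Any key returned by `ambiguous` is a package that a purely name-and-version-based SBOM would collapse into a single component even though its instances load different dependency versions.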

jkowalleck avatar Aug 10 '23 10:08 jkowalleck

@ljharb Feel free to try this utility for validating either SPDX or CycloneDX formats:

  • https://github.com/CycloneDX/sbom-utility#sbom-utility

Specifically:

  • https://github.com/CycloneDX/sbom-utility#validate

It allows granular control over error output and is intended to work well in command line toolchains.

mrutkows avatar Sep 13 '23 15:09 mrutkows

For SPDX validation, I would recommend either the online tool's validate function or the Verify command of the tools-java command line utility.

In addition to the schema validation, it validates some of the parameter string formats and relationship restrictions that can't be easily validated in the JSON schema (e.g. validating a license expression parses correctly).

goneall avatar Sep 13 '23 16:09 goneall

There is now an implementation PR up for this.

wraithgar avatar Sep 15 '23 17:09 wraithgar

Thanks to everyone for the great feedback/discussion! There seems to be general consensus that this feature is worth adding to the CLI and that the proposed approach is correct.

I've got a :white_check_mark: from @puerco from the SPDX camp and would love to get the blessing from someone on the CycloneDX side (perhaps @mrutkows or @stevespringett). Once ratified we can move on to the next step . . .

There is now an implementation PR up for this.

The code in ☝️ PR was used to generate the samples that appear at the bottom of the RFC.

bdehamer avatar Sep 18 '23 16:09 bdehamer

As SBOM generator tools are updated on a regular basis, it would be a good idea to monitor the quality of the SBOM. https://github.com/interlynk-io/sbomqs helps by generating a quality score for the SBOM, which can be used in the pipeline to accept or reject it.

riteshnoronha avatar Sep 28 '23 20:09 riteshnoronha

My understanding is that the cli PR landed and the sbom command shipped sometime last year. Should this PR be closed out/merged?

wesleytodd avatar Feb 09 '24 17:02 wesleytodd

Closed, since the RFC was never approved or fully evaluated.

ljharb avatar Feb 09 '24 18:02 ljharb

Since the public RFC calls are not run, is there an officially documented way to say this was or was not reviewed and/or approved? I agree that I was surprised this landed in the cli (didn't even know about it until early Jan), and AFAIK it implemented incorrect SBOMs (that is hearsay on my part, but from people I trust). It appears to me that this put the cart before the horse. I don't think it was ever documented that the RFC must merge before the feature is implemented, but it is unfortunate to have the cli feature land while this is sitting with "merging is blocked" status and no approvals from either community members or cli maintainers.

wesleytodd avatar Feb 09 '24 18:02 wesleytodd

I think that without the public RFC calls, or a replacement RFC process, this entire repo should probably be archived.

ljharb avatar Feb 09 '24 18:02 ljharb