syft
Enhance SPDX Support
Completing https://github.com/anchore/syft/issues/213 adds support for generating SPDX documents; however, there are several opportunities to expand upon what can be expressed in an SPDX document. For instance, we have a file cataloger and, for select package types, know which files belong to a package, so we could be leveraging the "FilesAnalyzed" package indication and showing the file digests for these package files. Additionally, we could be adding relationships between these files and packages. For packages with known transitive dependencies, we could be building a dependency graph. There are many directions this can take; this issue is a placeholder to show that we could be supporting more of what can be expressed in an SPDX document.
Explorations to try (but not limited to):
- [ ] #476
- [ ] #477
See #451, as it's possible that `filesAnalyzed` mandates `packageVerificationCode`, as per: https://spdx.github.io/spdx-spec/3-package-information/#39-package-verification-code
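For reference, the verification code described in that section of the spec is cheap to compute once the per-file SHA1 digests are known: skip any excluded files, sort the remaining hex digests, concatenate them without separators, and take the SHA1 of the result. A minimal sketch, where the function name and the path-to-digest input shape are assumptions for illustration and not syft's API:

```python
import hashlib


def package_verification_code(file_sha1s, excluded=()):
    """Compute an SPDX 2.x package verification code.

    `file_sha1s` maps file path -> SHA1 hex digest of that file.
    `excluded` lists paths to leave out (e.g. the SPDX document itself
    when it ships inside the package).
    """
    skip = set(excluded)
    # Sort the hex digests in ascending order, ignoring excluded files.
    digests = sorted(
        digest.lower() for path, digest in file_sha1s.items() if path not in skip
    )
    # Concatenate with no separators and hash the resulting string.
    return hashlib.sha1("".join(digests).encode("ascii")).hexdigest()
```

Because the digests are sorted before hashing, the result does not depend on the order in which the cataloger happened to visit the files.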
WRT describing dependencies in the SPDX output, I particularly love the idea because IMHO it's existing metadata that is potentially valuable for the SBOM consumer, but I can also see the argument for "flat SBOMs". I would say there are two levels: the first would be literally taking a big string with the dependency information for each package and adding it as a comment (or something semantically equivalent to a comment) in the SPDX document; the next level would be converting those package names from their native names to SPDX identifiers or purl-spec identifiers for proper linkage.
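A rough sketch of what the second level might involve, assuming the native dependency names have already been parsed into a mapping. All function names here are hypothetical, the purl builder is simplified (no namespace or qualifiers), and the ID sanitizer ignores possible collisions between distinct names:

```python
import re
from urllib.parse import quote


def to_spdx_id(name, version):
    """Map a native package name/version to an SPDX identifier.

    Characters outside letters, digits, '.' and '-' are replaced with '-'.
    """
    raw = f"{name}-{version}"
    return "SPDXRef-Package-" + re.sub(r"[^A-Za-z0-9.-]", "-", raw)


def to_purl(pkg_type, name, version):
    """Build a simplified package URL: pkg:<type>/<name>@<version>."""
    return f"pkg:{pkg_type}/{quote(name, safe='')}@{quote(version, safe='')}"


def dependency_relationships(packages, depends):
    """Turn native dependency names into DEPENDS_ON relationships.

    `packages` maps name -> version for everything in the SBOM;
    `depends` maps name -> list of native dependency names.
    """
    ids = {name: to_spdx_id(name, version) for name, version in packages.items()}
    relationships = []
    for name, deps in depends.items():
        for dep in deps:
            if dep not in ids:
                # Not resolvable to a cataloged package; only the
                # first-level comment string could still mention it.
                continue
            relationships.append({
                "spdxElementId": ids[name],
                "relationshipType": "DEPENDS_ON",
                "relatedSpdxElement": ids[dep],
            })
    return relationships
```

Dependencies that cannot be matched to a cataloged package are skipped here, which is one reason the first level (keeping the raw string as a comment) could still be worth emitting alongside the linked form.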
Another thing, which I'm not sure you're interested in considering, is adding any leftover files to the final SBOM document; SPDX supports files, as you know: https://spdx.github.io/spdx-spec/4-file-information/
This would be for anything else in the tree that hasn't been accounted for and that is relevant to the SBOM (i.e., not stuff under /home, etc., unless explicitly included). When thinking in terms of utilities, I think about the output of `debsums -c` or `cruft` as potential signals that I might need to explicitly call out a file in addition to packages. One particularly interesting scenario is a `Dockerfile` with `RUN wget foo` steps that leave files we might want to account for: maybe it's the golang release, or terraform, or the jx client. We can already hash them and run libmagic on them, but if we can also make it easy for SBOM publishers to identify which layer introduced the file, maybe they can easily add a comment showing what line in the Dockerfile (or in their `packer` script or in their `debootstrap` includes...) introduced the component as some form of auxiliary attestation. (Also mentioned in #435)
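To illustrate the kind of output I have in mind, here is a small sketch under made-up assumptions: the layer data shape (`created_by` plus a path-to-bytes mapping), the function name, and the default ignore list are all hypothetical and do not reflect how syft represents layers.

```python
import hashlib


def unowned_files(layers, owned_paths, ignore_prefixes=("/home/",)):
    """List files that no package claims, with the layer that introduced them.

    `layers` is an ordered list of dicts with 'created_by' (the build step)
    and 'files' (path -> bytes). `owned_paths` are paths claimed by packages.
    """
    owned = set(owned_paths)
    prefixes = tuple(ignore_prefixes)
    found = {}
    for index, layer in enumerate(layers):
        for path, content in layer["files"].items():
            if path in owned or path.startswith(prefixes):
                continue
            # A later layer that rewrites the same path replaces the entry.
            found[path] = {
                "fileName": path,
                "sha256": hashlib.sha256(content).hexdigest(),
                "comment": f"introduced by layer {index}: {layer['created_by']}",
            }
    return [found[path] for path in sorted(found)]
```

For example, with a layer created by `RUN wget foo` that drops `/foo`, the result would contain a single entry for `/foo` whose comment names that layer, which a publisher could then annotate with the corresponding Dockerfile line.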
Just a few ideas. Thanks for giving SPDX enrichment additional thought!