tern
tern copied to clipboard
Add SPDX generation using spdx-tools
This is set up to produce the same SPDX output as the current spdx generation module while utilising the spdx-tools library. The goal is to replace the current module with this new one, which will allow easy migration to more SPDX formats as well as SPDXv3. I tried to stay close to the structure of the original implementation.
I tested this using the following commands:
tern report -i golang:1.12-alpine -f spdxjson -o spdx_test.json
and
tern report -i golang:1.12-alpine -f spdxjson_new -o new_spdx_test.json
I compared the resulting json files using jd, treating arrays as unordered sets. The only differences were in timestamps, UUIDs, and two differences in how json output is generated:
- The enum value in spdx-tools is
PACKAGE_MANAGER, notPACKAGE-MANAGER - The optional key
documentDescribeshas been deprecated and is therefore not used by the spdx-tools. Only the correspondingDESCRIBESrelationship is serialized.
I added support for YAML, XML and RDF-XML formats.
@armintaenzertng Thank you very much for all your work on this! How does one denote which version of SPDX documents they want using this PR? I am assuming this PR, by default, generates SPDX 2.3 documents. However, we can't drop support for SPDX 2.2 since we have users who want it because it is the ISO standard version. There needs to be a way to denote SPDX version to generate on the command line before we can merge this.
This PR currently replicates the behavior of the current state. That is, SPDX 2.2 is hardcoded into the output. I will gladly add support for multiple SPDX versions, but we should clarify how this would work in the current workflow.
- Should there be an optional parameter that denotes the SPDX version (that gets ignored if no SPDX output is required)? If yes, how do we call that?
- Or will we add two (and more in the future) parameters for the
-fflag, likespdxjson2.2,spdxjson2.3,spdxjson3.0etc.? This is probably easier to implement, but will lead to much code duplication, probably not the best idea.
After some further consideration and having a deeper look at the code, point 2 from above might be the better alternative after all. I'll try to implement that.
I added the versioning I described above. @rnjudge, please have a look if that is OK with you. :)
@armintaenzertng do you want to schedule a zoom call about this? I would like to avoid mass code duplication as that was the whole point of using the SPDX tools library.
Yes, certainly! :) Do you have my email address?
I added a CLI version parameter for the output format. Formats that don't support this (i.e. everything except for SPDX) will raise an error if this is set.
I added SPDX-2.3 functionality.
In particular, this means that if the SPDX version is 2.3, we set the primary_package_purpose of the container package to CONTAINER and omit concluded license, declared license and copyright text in SpdxPackages if possible.
@rnjudge: tern report -i photon:3.0 -f spdxjson -sv 2.3 -o output.json followed by pyspdxtools -i output.json yields no errors or invalidations, so I'll mark this "Ready for review" now.
There are still some open issues regarding Spdx documents with file information, which I collected in #1240. As these existed in the old SPDX implementation already, I'd propose to fix them in a separate PR after this one to keep things clearer (this one is already quite large).
This would fix https://github.com/tern-tools/tern/issues/1211
@rnjudge @armintaenzertng For specifying the SPDX version of the output, I would propose the following:
tern report -f [email protected] -i golang:1.12-alpine -o alpine.spdx
and
tern report -f [email protected] -i golang:1.12-alpine -o alpine.spdx.json
Advantages would be that we have one less argument and it is similar to the Syft syntax: https://github.com/anchore/syft/
spdx-tag-value: A tag-value formatted report conforming to the SPDX 2.3 specification
[email protected]: A tag-value formatted report conforming to the SPDX 2.2 specification
spdx-json: A JSON report conforming to the SPDX 2.3 JSON Schema
[email protected]: A JSON report conforming to the SPDX 2.2 JSON Schema
That is what I had prototyped in https://github.com/tern-tools/tern/pull/1228
What do you think?
@vargenau: I believe the current implementation suggestion is a little more flexible in supporting more versions/formats in the future. Your proposed solution would result in five additional entrypoints (one per spdx format) per version.
Still, if necessary, the [email protected] usecase can be easily implemented by adding a new entrypoint with the generator just calling get_spdx_from_image_list with the right parameters.
Hi @armintaenzertng My proposal aim was:
- to have one less option
- to have a syntax similar to Syft If it is more complex to implement, no problem.
@rnjudge:
tern report -i photon:3.0 -f spdxjson -sv 2.3 -o output.jsonfollowed bypyspdxtools -i output.jsonyields no errors or invalidations, so I'll mark this "Ready for review" now. There are still some open issues regarding Spdx documents with file information, which I collected in #1240. As these existed in the old SPDX implementation already, I'd propose to fix them in a separate PR after this one to keep things clearer (this one is already quite large).
Is is really -sv 2.3 ?
In Rose message above, it was parser_report.add_argument('-fv', '--format-version',
I meant -sv, this was changed according to this comment.
Small update: I changed the spdx-tools dependency to the new 0.8.1 version.
small fix: I changed the check for the SPDX version to use the official string "SPDX-2.2" consistently.