Tag-value example appears to have typo for "excludes" in Package Verification Code
Looking at the tag-value example file in the development/2.2.1 branch, the PackageVerificationCode line has an "excludes file" value:
https://github.com/spdx/spdx-spec/blob/cd0876094cfe96e5ecf5fb04c61244d5e077673a/examples/SPDXTagExample-v2.2.spdx#L78
However, according to section 3.9.6 of the 2.2 spec, I think there should be an excludes: sub-tag which is missing in the example file. So I think this should instead read:
PackageVerificationCode: d6a770ba38583ed4bb4525bd96e50461655d2758(excludes: ./package.spdx)
I'm happy to submit a PR to fix this, though I'm guessing this is generated by the SPDX Java tools so I'm wondering if it is an issue for https://github.com/spdx/tools instead...
It's a bigger problem than that.
- The spec says: 3.9.6 Tag: PackageVerificationCode: (and optionally (excludes: FileName))
- The example says:
<packageVerificationCodeExcludedFile>
./package.spdx
</packageVerificationCodeExcludedFile>
- The JSON Schema says:
"packageVerificationCodeExcludedFiles" : {
"description" : "A file that was excluded when calculating the package verification code. This is usually a file containing SPDX data regarding the package. If a package contains more than one SPDX file all SPDX files must be excluded from the package verification code. If this is not done it would be impossible to correctly calculate the verification codes in both files.",
"type" : "array",
"items" : {
"description" : "A file that was excluded when calculating the package verification code. This is usually a file containing SPDX data regarding the package. If a package contains more than one SPDX file all SPDX files must be excluded from the package verification code. If this is not done it would be impossible to correctly calculate the verification codes in both files.",
"type" : "string"
}
- The "BEER-WARE" test file says:
"packageVerificationCode" : {
"packageVerificationCodeExcludedFiles" : [ "./package.spdx" ],
"packageVerificationCodeValue" : "d6a770ba38583ed4bb4525bd96e50461655d2758"
},
The JSON schema and JSON example calls it "ExcludedFiles" (plural) and includes a single file in an array that could hold more than one. The XML example calls it "ExcludedFile" (singular) and includes one file inside the element.
Are all SPDX use cases satisfied by allowing only one file to be excluded from the verification code? If so, the JSON schema and example and property name should be changed from an array to a single value. If not, the tag and XML examples must be changed to show more than one.
@swinslow you are correct, and what you think should be there also matches my understanding.
@davaya:
- the RDF/XML tags may obviously repeat, so if you had two files excluded, you would add each one of them in
spdx:packageVerificationCodeExcludedFiletags - the ontology actually specifies that this package verification code excluded file has
min 0(no maximum) - whenever we have XML elements that might repeat, the convention is that JSON attributes are in plural and value is an array.
In conclusion, I don't find any issue with the spec; but the example file is missing the wording excludes:. We probably have not given much thought to how to represent multiple files in the tag-value format.
If tags may obviously repeat, then defining unambiguously which various multiplicities are valid would be helpful. Multiplicity is specified for PackageVerificationCode as a whole, but where is the Multiplicity for ExcludedFile(s) specified? (In other words, how would the spec be different if ExcludedFile were Mandatory, one?)
3.9.3 Cardinality: Mandatory, one if FilesAnalyzed is true or omitted, zero (must be omitted) if FilesAnalyzed is false.
Multiplicity: 1 0..1 0..*
<packageVerificationCodeExcludedFile> N Y? Y?
</packageVerificationCodeExcludedFile>
<packageVerificationCodeExcludedFile> Y Y Y
./package.spdx
</packageVerificationCodeExcludedFile>
<packageVerificationCodeExcludedFile> N N Y?
./package.spdx
./package2.spdx
./package3.spdx
</packageVerificationCodeExcludedFile>
<packageVerificationCodeExcludedFile> N N Y?
./package.spdx
</packageVerificationCodeExcludedFile>
<packageVerificationCodeExcludedFile>
./package2.spdx
</packageVerificationCodeExcludedFile>
<packageVerificationCodeExcludedFile>
./package3.spdx
</packageVerificationCodeExcludedFile>
The end product is an unambiguous specification that yields lossless translation between data formats. Single examples that cover just common cases aren't sufficient, tooling requires defined algorithms and rules that operationalize conventions. At a minimum, all examples of types that allow more than one should show more than one.
You are correct that this part of the spec is not great...
We should have written different section for each sub-field: packageVerificationCodeValue and packageVerificationCodeExcludedFile saying that the first has cardinality "one (mandatory)" and the second "zero or more (optional, multiple)".
I note that the next subsection also suffers from the same issue: nowhere it is specified that inside Checksum you have one algorithm and one checksumValue.
In general, we did not handle these sub-fields at all. When, in tag-value format, the "value" is a structure of more than one thing, we had left it without description in the specification. The ontology has the correct definitions, I think, but this does not cover all other formats.
Since we are replacing tag/value in 3.0+, I'm closing this issue.
cc: @kestewart