tools-golang
tools-golang copied to clipboard
SPDX tag-value parsing fails for ExternalDocumentRef in document.
The attached file fails to parse with the following error message (please remove .txt before processing)
Error: received unknown tag ExternalDocumentRef in CreationInfo section
ExternalDocumentRef is a valid tag with 1+ cardinality, so this appears to be a bug.
My guess is we are missing ExternalDocumentReference check in this enumeration of tags: https://github.com/spdx/tools-golang/blob/main/spdx/v2/v2_2/tagvalue/reader/parse_creation_info.go#L26-L101
It got lost around this refactor: https://github.com/spdx/tools-golang/commit/460cf54ece7eba418c50407ee35544c5d63588a7#diff-dc282dc5935546282e5429e56eda94391c511ea039311c1d9db2f6501f730d4cL41
Hi @surendrapathak -- a quick look makes me think your assessment is correct -- it looks like some processing got removed that would handle this. The tag-value format is very difficult for me to understand completely; one of the last SPDX WG meetings I was told that certain ordering of elements results in different processing. Is there some definition of the exact parsing rules that should be used?
@kzantow Thanks for checking it out. Here is what I think is going on:
-
parsePair is responsible for running parser with implicit assumption that state of the parser (
parser.st
) will be updated as tags starting new sections are found. I think that's why ordering of element was mentioned. -
parsePairFromStart is meant to capture all tags (including
ExternalDocumentRef
) that should always exist before CreationInfo section begins (likely withLicenseListVersion
). - Once a new tag that does not belong to above list is found, CreationInfo is expected.
- From CreateInfo, new sections (and subsequent state of the parser) can begin e.g Package and File and so on.
Assuming that's correct, the problem with this file is that ExternalDocumentRef:
exists after Creator:
while the tag-value parser expects it to come before that.
If that's correct then the problem is with the SPDX generator itself (Zephyr SPDX builder) rather than this code. Can anyone from the spec team review and confirm this?
Hi @surendrapathak and @kzantow,
Thanks for the comments on this. I don't think the issue is with the linked SPDX file being invalid. It passes validation using the SPDX online tools which I believe are using the SPDX java tools under the hood.
I'm taking a look at the past and current tools-golang tag-value parser code, will circle back in a bit with a few thoughts.
Removing 0.5.0 milestone tag since there is not yet a resolution near the release deadline.