tools-python
"don't work for SPDX v2.2"
On https://github.com/intel/cve-bin-tool/issues/1382, @anthonyharrison writes:
I am aware of these tools but when I looked at them they didn't work for SPDX v2.2 files (certainly the version in PyPi).
and mentions the test files in https://github.com/spdx/spdx-spec/tree/development/v2.2.2/examples
Is this correct?
@zvr The cve-bin-tool support for SPDX validates against the SPDX 2.2 examples. The comment referred to the Python tools (https://github.com/spdx/tools-python), which do not currently validate against the SPDX 2.2 standard.
Yes, @anthonyharrison, I know ;-) I saw your comment on the other repo and wanted to make people working on spdx-tools aware.
Ah, sorry: the "Is this correct?" question was not addressed to you; it was to the developers of spdx-tools.
Thanks for the report! Do you have a particular test case, and would you mind checking whether this works against the latest main branch?
@pombredanne The test case I have been using is the set of example SPDX documents from https://github.com/spdx/spdx-spec/tree/development/v2.2.2/examples.
I have been using the latest code in the repo (and not the version released on PyPi).
For the tag-value file (using pp_tv.py), I get an error reporting that the Annotation Type should be REVIEW or OTHER (which it is!); there is also an error reporting that the filename should be defined after the PackageName.
AnnotationType must be "REVIEW" or "OTHER". Line: 23
FileName Can not appear before PackageName, line: 41
followed by an OrderError exception
For the RDF file (using pp_rdf.py), I get an Index out of range exception when processing the SPDX_uri
doc.ext_document_references[-1].spdx_document_uri = spdx_doc_uri
IndexError: list index out of range
My actual use case is to extract the PackageName, Version pairs and use them to find security vulnerabilities, so most of the content is ignored!
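For a use case this narrow, a line-based scan of the tag-value file may be enough. The following is a minimal sketch, not part of spdx-tools: `extract_packages` is a hypothetical helper, the sample values are illustrative, and it assumes that `PackageVersion` follows the `PackageName` it belongs to. It does not handle multi-line `<text>` values, so a line inside such a block that happens to start with `PackageName:` would be misread.

```python
from typing import Iterable, List, Optional, Tuple


def extract_packages(lines: Iterable[str]) -> List[Tuple[str, Optional[str]]]:
    """Collect (PackageName, PackageVersion) pairs from tag-value lines.

    Hypothetical helper: every other tag is ignored, and a package
    without a PackageVersion tag is reported with version None.
    """
    packages: List[Tuple[str, Optional[str]]] = []
    name: Optional[str] = None
    version: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        # Skip comments and anything that is not a "Tag: value" line.
        if line.startswith("#") or ":" not in line:
            continue
        tag, _, value = line.partition(":")
        tag, value = tag.strip(), value.strip()
        if tag == "PackageName":
            # A new package starts; flush the previous one, if any.
            if name is not None:
                packages.append((name, version))
            name, version = value, None
        elif tag == "PackageVersion" and name is not None:
            version = value
    if name is not None:
        packages.append((name, version))
    return packages


# Illustrative input only; not copied from the spec examples.
sample = """\
SPDXVersion: SPDX-2.2
PackageName: glibc
SPDXID: SPDXRef-Package
PackageVersion: 2.11.1
FileName: ./lib/foo.c
PackageName: Jena
PackageVersion: 3.12.0
PackageName: Saxon
"""
pairs = extract_packages(sample.splitlines())
```

Because the scan ignores element ordering entirely, it is not affected by the FileName/PackageName ordering error above, at the cost of doing no validation at all.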
Serializing 2.2 per the jsonschema at https://github.com/spdx/tools-python/pull/197
Related to @anthonyharrison's comment above: would it make sense to automatically use https://github.com/spdx/spdx-spec/tree/master/examples in tests? Currently the master branch examples (e.g., https://github.com/spdx/spdx-spec/blob/master/examples/SPDXJSONExample-v2.2.spdx.json) also fail for me when using the JSONParser.
#211 is a duplicate of this issue, or rather a subproblem of it. As commented in the other issue, I tried to analyze the errors described by @anthonyharrison but couldn't fix them yet:
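One way this could look, as a rough sketch: walk a local checkout of the spec examples and try to load every file. Everything here is an assumption for illustration: `EXAMPLES_DIR` is a hypothetical checkout path, and `load_json_example` only uses the standard `json` module plus a check for the `spdxVersion` field. In a real test that function is where the project's own JSON parser would be called; its API is deliberately not assumed here.

```python
import json
from pathlib import Path
from typing import List

# Hypothetical location of a local checkout of the spec examples.
EXAMPLES_DIR = Path("spdx-spec/examples")


def find_examples(root: Path, suffix: str) -> List[Path]:
    """Return all files under root whose name ends with suffix, sorted."""
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


def load_json_example(path: Path) -> dict:
    """Stand-in for the real parser: load the JSON and check one field."""
    with path.open(encoding="utf-8") as handle:
        document = json.load(handle)
    if "spdxVersion" not in document:
        raise ValueError(f"{path} has no spdxVersion field")
    return document


def run_all(root: Path) -> List[str]:
    """Try every JSON example and return one message per failure."""
    failures = []
    for path in find_examples(root, ".spdx.json"):
        try:
            load_json_example(path)
        except (ValueError, OSError) as exc:  # JSONDecodeError is a ValueError
            failures.append(f"{path.name}: {exc}")
    return failures
```

Collecting all failures instead of stopping at the first one means a single run shows which example files still break, which fits the situation described in this thread where different formats fail for different reasons.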
I had a deeper look at the errors you provided. The first one is rather misleading, it is thrown by the tagvalue parser here because the lexer couldn't parse the given line. The error message should be adjusted here. As far as I understand the error is produced here but I don't know yet how to fix this.
Concerning the second error: The current tag-value parser expects an SPDX document to be ordered so that the packages are placed first and the files come afterwards. The parsing of some fields is based on this ordering, e.g. here. This also needs a deeper understanding of the tag-value parser.
As discussed in #244, we will soon refactor the whole data model and the builder/parser structure. In this context we will then also take care of the errors described here.
I would agree with @nettrino that it would be best to use the examples from the spec for the tests.
I would close this issue for now as the described errors for tag-value files are fixed and all example files for version 2.2 except the rdf example from the spec repo can be parsed now. Concerning the rdf file I opened another issue (#323) which is more specific than this one. If anyone has any objections, please ping and/or reopen.