tools-python icon indicating copy to clipboard operation
tools-python copied to clipboard

Add unpackaged files per SPDX spec

Open bjamesv opened this issue 2 years ago • 5 comments

Add Files without associated packages, as allowed by the spec but not currently supported in Document class.

  • [x] unpackaged Files data model & method (Document.add_file)
  • [x] tag/value Writer, Parser & tests
  • [x] RDF Writer, Parser & tests
  • [x] JSON-yaml-XML Writer, Parser & tests

Also leverages the unpackaged Files data model, to fix rdf parser bug where every referencesFile are incorrectly added to the last-listed Package.

Resolves #181

Signed-off-by: Brandon J. Van Vaerenbergh [email protected]

bjamesv avatar Jun 13 '22 22:06 bjamesv

can we get review/merge going for this, please?

has been very helpful generating SPDX with our tools-python fork

bjamesv avatar Aug 03 '22 16:08 bjamesv

PR motivation, in part derived from this 2017 kestewart response:

Screen Shot 2022-08-03 at 10 00 42 AM https://lists.spdx.org/g/spdx/topic/22079688#1127

Happy to discuss! 🙇

bjamesv avatar Aug 03 '22 17:08 bjamesv

Thanks @bjamesv for the extensive PR! Very much appreciated.

I did a review primarily for the spec - I'm not much of a Python programmer, so I won't be much help on the actual code.

I left a few requested changes in the review. Let let me know if you need any additional information or context.

goneall avatar Aug 03 '22 21:08 goneall

Testing against latest example files is showing me 3-4 current tools-python oddities with parsing annotations, license info, etc. which don't seem like they would be too bad to include in this PR. 👍 Might help incrementally resolve #196 too.

... It would be great if you could update the PR to use the documentDescribes relationship for both files and packages. See spec issues 543 and Pull Request 38 comment for context.

This correction was proposed in Nov. 2021 PR #197 (Issue #184). @goneall can you rebase #197 on main HEAD and see if the actions pass/review can be concluded?

Issues are certainly adjacent, but I hesitate to incorporate those fixes into this PR. I would happily refactor this PR so it works on top of #197 if that was merged, though.

bjamesv avatar Aug 05 '22 17:08 bjamesv

This correction was proposed in Nov. 2021 PR #197 (Issue https://github.com/spdx/tools-python/issues/184). @goneall can you rebase https://github.com/spdx/tools-python/pull/197 on main HEAD and see if the actions pass/review can be concluded?

I just requested @dholth to rebase his PR. If it passes CircleCI, I'll merge that into main.

goneall avatar Aug 05 '22 18:08 goneall

@bjamesv The fixed version of https://github.com/spdx/tools-python/pull/197 was merged into main by now.

I have to admit that I am quite skeptical of the changes in this PR, though. Clearly, the data model has to be changed to allow for files outside of a package, but I would like to do the change in a way that aligns the data model with the spec. Concretely: Instead of having files both on the document level ("unpackaged files" in this PR) and on the package level, only include them on the document level. If a file is included in a package, that should be represented by a CONTAINS relationship. (At least this is my understanding, @goneall correct me if I'm wrong)

Removing files from the package level may be quite a bit of additional work. @bjamesv do you want to continue with this? We are also planning to do a major redesign of the data model after a release of the current state, that would also include this change.

nicoweidner avatar Nov 04 '22 16:11 nicoweidner

As there doesn't seem to be any more activity on this PR and the issue of removing the nested files in packages is still outstanding, I will close this PR. Work along these lines will continue in https://github.com/spdx/tools-python/pull/288.

Sorry this got ignored for so long, @bjamesv !

nicoweidner avatar Nov 15 '22 09:11 nicoweidner