REUSE-style file tags and how to support them with SPDX 3.0
Annex H currently defines file tags by referencing the tag-value format (as defined in 2.3). This raises several questions about migrating them to, and supporting them in, 3.0.
Problem 1: breaking changes in property names and the general structure
Because 2.3 allows arbitrary properties from the file information to be used as tags, and these properties might be renamed or restructured in 3.0, it is not clear how to parse and support the comments in existing files.
There are some options that come to mind:
- stay with the current set of properties from 2.3 and do not change --> a mapping needs to be maintained and not all features from 3.0 are available
- support both, names from 2.3 and 3.0 --> "old" parsers do not support everything and miss stuff or fail
- only support 3.0 --> this is breaking on both sides
- instead of SPDX- use SPDX3- as prefix to not clash --> makes currently used tags invalid and would imply a migration?
- add a version statement --> an additional line ...
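To make the trade-offs between these options more concrete, here is a minimal Python sketch of a parser that accepts both prefixes and translates 2.3 tag names through a maintained mapping. The mapping table, the 3.0-side property names, and the `SPDX3-` tag syntax are all assumptions made for illustration; none of them has been agreed on anywhere.

```python
import re

# Hypothetical mapping from 2.3 tag names to 3.0-style property names.
# Illustrative only: no such table exists in the spec or in REUSE.
TAG_2_3_TO_3_0 = {
    "FileCopyrightText": "software_copyrightText",
    "FileAttributionText": "software_attributionText",
    "FileComment": "comment",
}

# Matches "SPDX-<name>: <value>" and the hypothetical "SPDX3-<name>: <value>".
TAG_LINE = re.compile(r"\b(SPDX3?)-([A-Za-z_-]+):[ \t]*(.*)")


def parse_tags(text):
    """Return a list of (origin, name, value) tuples for tags found in text."""
    results = []
    for line in text.splitlines():
        match = TAG_LINE.search(line)
        if match is None:
            continue
        prefix, name, value = match.groups()
        value = value.strip()
        if prefix == "SPDX3":
            # Already a 3.0-style tag: pass the name through unchanged.
            results.append(("3.0", name, value))
        elif name in TAG_2_3_TO_3_0:
            # A 2.3 tag with a known counterpart: translate the name.
            results.append(("2.3-mapped", TAG_2_3_TO_3_0[name], value))
        else:
            # A 2.3 tag without a mapping entry: keep it as-is.
            results.append(("2.3-unmapped", name, value))
    return results
```

The "2.3-unmapped" branch is where the cost of the mapping option shows up: every tag without an entry either needs a decision or stays a 2.3-only concept.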
Some properties got moved
The license and copyright information was previously part of the file properties, but in 3.0 it is expected to be expressed via relationships. This makes it more complex.
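As a rough illustration of why this is more complex, the following Python sketch turns a single in-file license tag into three separate elements instead of one property. The element and field names are loosely modelled on the 3.0 JSON-LD serialization and should be checked against the spec. The ID scheme, and the choice of `hasDeclaredLicense` rather than a concluded license, are assumptions.

```python
def license_tag_to_elements(file_id, file_name, license_expression):
    """Build file, license, and relationship elements for one license tag.

    Sketch only: names approximate SPDX 3.0 and the IDs are hypothetical.
    """
    license_id = f"{file_id}-license"
    file_element = {
        "type": "software_File",
        "spdxId": file_id,
        "name": file_name,
    }
    license_element = {
        "type": "simplelicensing_LicenseExpression",
        "spdxId": license_id,
        "simplelicensing_licenseExpression": license_expression,
    }
    # The link between file and license is its own element, not a
    # property of the file as it was in 2.3.
    relationship = {
        "type": "Relationship",
        "spdxId": f"{file_id}-declared-license",
        "from": file_id,
        "to": [license_id],
        "relationshipType": "hasDeclaredLicense",
    }
    return [file_element, license_element, relationship]
```

A tool reading `SPDX-License-Identifier: MIT` from a file would therefore have to mint identifiers and emit a relationship, which a plain tag-to-property mapping cannot express.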
Problem 2: there is not yet a tag-value serialization
The current state of the 3.0 model does not yet define a tag-value serialization that could be used to define the same concept at the level of SPDX 3.
ping @mxmehl and @silverhook
> stay with the current set of properties from 2.3 and do not change --> a mapping needs to be maintained and not all features from 3.0 are available

I like this approach. We would need to clearly document the mapping and compatibility. Perhaps even move this to a completely separate spec.
I’d be most in favour of either:

> stay with the current set of properties from 2.3 and do not change --> a mapping needs to be maintained and not all features from 3.0 are available

or

> instead of SPDX- use SPDX3- as prefix to not clash --> makes currently used tags invalid and would imply a migration?

But I would need to understand how the mapping would work in practice. I’m a tiny bit wary of introducing yet another spec just to keep things afloat.
I personally believe that the information on how to denote SPDX information in files needs a major overhaul -- and with @mxmehl and @silverhook we have discussed a number of changes. For example, specifying how to add this information not only inside files but also alongside them (useful for non-text files).
Just to give more concrete info on the current state: we have good ol' SPDX-License-Identifier: (Annex E) which we can all agree will not change.
Annex H also allows people to use the following tags inside files (in alphabetical order):
| File | Snippet |
|---|---|
| 1. SPDX-ArtifactOfProjectHomePage: | 17. SPDX-SnippetBegin |
| 2. SPDX-ArtifactOfProjectName: | 18. SPDX-SnippetEnd |
| 3. SPDX-ArtifactOfProjectURI: | |
| 4. SPDX-FileAttributionText: | |
| 5. SPDX-FileChecksum: | 19. SPDX-LicenseInfoInSnippet: |
| 6. SPDX-FileComment: | 20. SPDX-SnippetAttributionText: |
| 7. SPDX-FileContributor: | 21. SPDX-SnippetByteRange: |
| 8. SPDX-FileCopyrightText: | 22. SPDX-SnippetComment: |
| 9. SPDX-FileDependency: | 23. SPDX-SnippetCopyrightText: |
| 10. SPDX-FileName: | 24. SPDX-SnippetFromFileSPDXID: |
| 11. SPDX-FileNotice: | 25. SPDX-SnippetLicenseComments: |
| 12. SPDX-FileType: | 26. SPDX-SnippetLicenseConcluded: |
| 13. SPDX-LicenseComments: | 27. SPDX-SnippetLineRange: |
| 14. SPDX-LicenseConcluded: | 28. SPDX-SnippetName: |
| 15. SPDX-LicenseInfoInFile: | 29. SPDX-SnippetSPDXID: |
| 16. SPDX-SPDXID: | |
I'm not even sure that all 29 of them are useful, so we might as well be explicit about what is allowed instead of a blanket "use anything by prepending SPDX-".
I think REUSE so far only mentions SPDX-FileCopyrightText (no. 8 above), so we can definitely keep this one.
> I'm not even sure that all 29 of them are useful, so we might as well be explicit about what is allowed instead of a blanket "use anything by prepending SPDX-".

Suggest aligning with the fields mentioned in Annex G SPDX Lite Fields.
@goneall none of the SPDX Lite fields are about Files or Snippets, so they cannot appear inside files.
> none of the SPDX Lite fields are about Files or Snippets, so they cannot appear inside files.

Two thoughts:
- We include a union of the Lite fields with the file fields
- Alternatively, we include the file fields in the Lite definition
> I’d be most in favour of either:
>
> stay with the current set of properties from 2.3 and do not change --> a mapping needs to be maintained and not all features from 3.0 are available
>
> or
>
> instead of SPDX- use SPDX3- as prefix to not clash --> makes currently used tags invalid and would imply a migration?
>
> But would need to understand how the mapping thing would work in practice. I’m a tiny bit wary of introducing yet another spec just to keep things afloat.

+1 to everything @silverhook said above.
> I think REUSE till now only mentions SPDX-FileCopyrightText (no. 8) above so we can definitely keep this one.

IIRC, we use the following tags:
- SPDX-License-Identifier
- SPDX-FileCopyrightText
- SPDX-FileContributor (via `reuse annotate --contributor`, will be in 2.x)
- SPDX-SnippetBegin
- SPDX-SnippetEnd
- SPDX-SnippetCopyrightText
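The idea of being explicit about what is allowed, rather than accepting anything that starts with SPDX-, could be sketched in Python as follows. The allowlist simply mirrors the tags listed in this comment. The function name and the matching rule are assumptions for illustration and do not describe how the REUSE tool works.

```python
import re

# Allowlist taken from the tags named in the comment above.
ALLOWED_TAGS = frozenset({
    "SPDX-License-Identifier",
    "SPDX-FileCopyrightText",
    "SPDX-FileContributor",
    "SPDX-SnippetBegin",
    "SPDX-SnippetEnd",
    "SPDX-SnippetCopyrightText",
})

# Anything that looks like an SPDX- tag name, with or without a trailing
# colon (SPDX-SnippetBegin and SPDX-SnippetEnd take no value).
TAG = re.compile(r"\bSPDX-[A-Za-z-]+")


def unknown_tags(text):
    """Return the sorted SPDX- tag names in text that are not allowlisted."""
    found = set(TAG.findall(text))
    return sorted(found - ALLOWED_TAGS)
```

With this check, a file that carries `SPDX-FileType: SOURCE` next to the usual copyright and license lines would be reported as using one unlisted tag, `SPDX-FileType`, while the allowlisted tags pass silently.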
Tomorrow it will be a year since the last activity in this issue. Should downstream users continue to rely on SPDX 2.3 for in-file tagging (e.g., using SPDX-FileType instead of SPDX-SoftwarePurpose (and/or -ContentType))?
IIRC, SPDX-ArtifactOfProject* is deprecated (I suppose in favour of PURL and SWID), and it would be really useful to have a way to indicate the origin of the file or snippet. PURL/SWID don’t work for this because they are limited to the Package level, so something would need to change in the spec or Annex H.
We have now released REUSE 3.0 and made it still rely on SPDX 2.3. It would be great if we could eventually bump that to the newest SPDX spec, though.
Well, SPDXv2 was saying "you can use SPDX-tagname: value in files, where tagname is a tag defined in the tag-value format."
In SPDXv3 we don't have a tag-value format (yet?), so the same approach would not work.
My comment above was an attempt to codify which such entries exist, so that we can have them listed somewhere. It will most probably not be in the spec itself (unless we come up with a final version of an informative annex in the next days), but it might be in another document, or even in REUSE.