reuse-tool
Do not parse DEP5 covered files
I have files that are Jinja2 templates used to generate project files, and reuse complains that it cannot parse them.
For example:
❯ mkdir projtest
❯ cd projtest
❯ git init
Alias tip: g init
Initialized empty Git repository in …/projtest/.git/
❯ mkdir .reuse
❯ reuse download Apache-2.0
Successfully downloaded …/projtest/LICENSES/Apache-2.0.txt.
❯ echo "# SPDX-FileCopyrightText: {{ cookiecutter.copyright }}\n# SPDX-License-Identifier: {{ cookiecutter.spdx_license_identifier }}\n" > foo.py
❯ cat foo.py
# SPDX-FileCopyrightText: {{ cookiecutter.copyright }}
# SPDX-License-Identifier: {{ cookiecutter.spdx_license_identifier }}
❯ reuse lint
reuse._util - ERROR - Could not parse '{{ cookiecutter.spdx_license_identifier'
reuse.report - ERROR - An unexpected error occurred whilst parsing 'foo.py'
KeyError: 'path'
# READ ERRORS
Could not read:
* foo.py
# MISSING COPYRIGHT AND LICENSING INFORMATION
The following files have no copyright and licensing information:
* other.py
# SUMMARY
* Bad licenses:
* Deprecated licenses:
* Licenses without file extension:
* Missing licenses:
* Unused licenses:
* Used licenses: Apache-2.0
* Read errors: 1
* Files with copyright information: 0 / 1
* Files with license information: 0 / 1
Unfortunately, your project is not compliant with version 3.0 of the REUSE Specification :-(
reuse reports a read error because it cannot parse the {{. The intention is that {{ cookiecutter.spdx_license_identifier }} is replaced by a correct value when the project is generated. In general I think it is good to include these headers, so that a project generated from the template is REUSE-compliant right away.
So I cannot add the license information for the template itself directly in the template files. To license the template files themselves, I would add them to the DEP5 file:
❯ echo "Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/\n\nFiles: *.py\nCopyright: 2020 Foo Bar\nLicense: Apache-2.0\n" > .reuse/dep5
❯ cat .reuse/dep5
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

Files: *.py
Copyright: 2020 Foo Bar
License: Apache-2.0
❯ echo "print('hi')" > other.py
❯ reuse lint
reuse._util - ERROR - Could not parse '{{ cookiecutter.spdx_license_identifier'
reuse.report - ERROR - An unexpected error occurred whilst parsing 'foo.py'
KeyError: 'path'
# READ ERRORS
Could not read:
* foo.py
# SUMMARY
* Bad licenses:
* Deprecated licenses:
* Licenses without file extension:
* Missing licenses:
* Unused licenses:
* Used licenses: Apache-2.0
* Read errors: 1
* Files with copyright information: 1 / 1
* Files with license information: 1 / 1
Unfortunately, your project is not compliant with version 3.0 of the REUSE Specification :-(
However, reuse still tries to parse the file, and the command fails (non-zero exit code). Could reuse skip parsing files that are already covered by DEP5?
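To make the request concrete, here is a minimal sketch of the behaviour being asked for. It is not how reuse is implemented: the helper names are hypothetical, and `fnmatchcase` is only an approximation of DEP5 globbing (it shares the rule that `*` also matches `/`, but additionally treats `[...]` as a character class).

```python
from fnmatch import fnmatchcase

# Hypothetical names for illustration; not taken from reuse's source code.
DEP5_PATTERNS = ["*.py"]  # the "Files:" value from the .reuse/dep5 above


def covered_by_dep5(path, patterns=DEP5_PATTERNS):
    """Return True if the repository-relative path matches a Files pattern."""
    # fnmatchcase approximates DEP5 globbing: "*" also matches "/".
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def files_to_parse(paths, patterns=DEP5_PATTERNS):
    """Keep only the files whose contents would still need to be scanned."""
    return [p for p in paths if not covered_by_dep5(p, patterns)]


print(files_to_parse(["foo.py", "other.py", "README.md"]))
```

With the pattern `*.py`, both `foo.py` and `other.py` would be skipped and only `README.md` would still be scanned for tags.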
Ah, it's always a problem when there are SPDX-FileCopyrightText/License-Identifier strings in the files that are not meant for REUSE.
The easiest way would be if you could split the string somehow in your template, so something like: print("SPDX-","FileCopyrightText")
I'm not sure whether there is another way in this case. @carmenbianca perhaps?
@mxmehl thanks, your idea is a good workaround, although Python's print() would insert a space between its arguments and is not supported by my template engine anyway. Using a plain string expression as a no-op works, though. With the following, reuse no longer reports a parsing error, and the template engine produces the correct output:
# {{ "SPDX" }}-FileCopyrightText: {{ cookiecutter.copyright }}
# {{ "SPDX" }}-License-Identifier: {{ cookiecutter.spdx_license_identifier }} 
Nevertheless, is it on purpose that files covered by DEP5 are still parsed?
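A small sketch of why the split tag helps, under two loudly stated assumptions: `TAG` is a simplified stand-in for a tag scanner (reuse's real pattern may differ), and `render` is a toy substitute for the template engine that only understands `{{ "literal" }}` and `{{ dotted.name }}`.

```python
import re

# Simplified stand-in for a tag scanner; reuse's real pattern may differ.
TAG = re.compile(r"SPDX-License-Identifier:\s*(.*)")

original = "# SPDX-License-Identifier: {{ cookiecutter.spdx_license_identifier }}"
workaround = '# {{ "SPDX" }}-License-Identifier: {{ cookiecutter.spdx_license_identifier }}'


def render(template, context):
    """Toy renderer for `{{ "literal" }}` and `{{ dotted.name }}` only."""
    def substitute(match):
        expr = match.group(1).strip()
        if len(expr) >= 2 and expr[0] == expr[-1] == '"':
            return expr[1:-1]  # string literal: emit its contents unchanged
        return context[expr]   # variable: look it up by its dotted name
    return re.sub(r"\{\{(.*?)\}\}", substitute, template)


context = {"cookiecutter.spdx_license_identifier": "Apache-2.0"}

print(bool(TAG.search(original)))    # raw template contains a contiguous tag
print(bool(TAG.search(workaround)))  # split tag is not contiguous in the template
print(render(workaround, context))   # rendered output contains the full tag again
```

The tag only becomes contiguous after rendering, so the template file is not picked up by the scanner while the generated file still carries a valid header.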
This overlaps with #253. I think the most useful change would be to document more clearly how the search for license information happens and in which order (.license file, embedded in the file, .reuse/dep5, last match?). AFAICT, if a .license file exists then the file itself is not read; but if the file is covered by dep5, it is still scanned for tags inside itself (and may raise an error). This is probably desirable, because otherwise wildcards in dep5 might override valid information inside files. It just isn't clear that this is what's going on (if it is).
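The observation above can be written down as a tiny truth table. This only models the behaviour as described in this comment ("AFAICT"); it is not taken from reuse's source, and the function name is hypothetical.

```python
def file_is_parsed(has_license_sidecar: bool, covered_by_dep5: bool) -> bool:
    """Model the observation above: is the file itself scanned for tags?"""
    if has_license_sidecar:
        # A "<name>.license" file exists, so the file itself is not read.
        return False
    # Coverage by dep5 is deliberately ignored here: per the observation,
    # it does not stop the file from being scanned for tags.
    return True


for has_sidecar in (True, False):
    for covered in (True, False):
        print(has_sidecar, covered, file_is_parsed(has_sidecar, covered))
```

In this model `covered_by_dep5` never changes the result, which is exactly the behaviour the issue is asking about.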
I just encountered a similar problem. I'm including the source code of a third-party library in a project I maintain. The library uses the deprecated LGPL-3.0 identifier, which reuse does not accept. I would like to override these annotations in the dep5 file, but reuse still parses the library files and reports the incorrect license identifier.
Yes, we will have to define the precedence clearly. There is fsfe/reuse-docs#70 which proposes some options. Input is welcome!
Closing this as it is now a documentation issue: https://github.com/fsfe/reuse-docs/issues/70