Inconsistency between the tool behaviour and the spec

Open walpox opened this issue 2 years ago • 3 comments

reuse CLI version: 2.1.0. Platform: Windows 10 x86-64.

Version 3.0 of the REUSE specification states the following:

Comment headers

To implement this method, each plain text file that can contain comments MUST contain comments at the top of the file (comment header) that declare that file’s Copyright and Licensing Information.

If a file is not a plain text file or does not permit the inclusion of comments, the comment header that declares the file’s Copyright and Licensing Information SHOULD be in an adjacent file of the same name with the additional extension .license (example: cat.jpg.license if the original file is cat.jpg).

According to the first quoted sentence (the MUST requirement), the REUSE header can only be present in-file when it starts at the top, presumably at line 1. However, one of the projects I contribute to uses Zola to generate a static website, and this site generator requires what's called a TOML front matter at the top of content Markdown files to build without errors.

If no reuse information is present in one of these files and the reuse annotate command is used, information will be added at line 1 and Zola will fail to build. That's expected. However, if the reuse information was added (e.g. manually) after the TOML front matter, the reuse annotate tool will modify that reuse block fine.

For example:

+++
title = "My blog"
author = ["John Doe"]
+++

<!--
SPDX-FileCopyrightText: 2023 John Doe
-->

This is my awesome blog! Hope you enjoy it!

If you run the following command in a PowerShell terminal:

reuse annotate --merge-copyrights `
--license "Apache-2.0" `
--copyright "Jane Doe" `
.\content\myfile.md

The reuse CLI tool will change the file to:

+++
title = "My blog"
author = ["John Doe"]
+++

<!--
SPDX-FileCopyrightText: 2023 John Doe
SPDX-FileCopyrightText: 2023 Jane Doe

SPDX-License-Identifier: Apache-2.0
-->

This is my awesome blog! Hope you enjoy it!

If the specification states that the reuse header MUST be at the top of the file, I would expect the reuse annotate tool to give either a warning or an error in these cases. Otherwise, a more suitable interpretation for the specification would be SHOULD instead of MUST.

Using separate *.license files is what we decided to do. Markdown files of this type do allow comments, just not at the very top, so the second sentence in bold in the quotation could be clarified to state that separate *.license files should also be used for files that do not permit the inclusion of comments at the top of their contents.
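
For illustration, the *.license approach described above could be scripted roughly as follows. This is a minimal sketch, not part of the reuse tool: the helper name write_license_sidecar is hypothetical, and the output layout (copyright lines, a blank line, then the license identifier) simply mirrors the header shown earlier in this issue.

```python
from pathlib import Path


def write_license_sidecar(path, copyrights, license_id):
    """Write SPDX tags to '<path>.license' next to the original file.

    Hypothetical helper for illustration; the original file is left untouched,
    so a front matter block at the top of it is never disturbed.
    """
    target = Path(path)
    # cat.jpg -> cat.jpg.license, myfile.md -> myfile.md.license
    sidecar = target.with_name(target.name + ".license")
    lines = [f"SPDX-FileCopyrightText: {c}" for c in copyrights]
    lines += ["", f"SPDX-License-Identifier: {license_id}", ""]
    sidecar.write_text("\n".join(lines), encoding="utf-8")
    return sidecar
```

Calling write_license_sidecar("content/myfile.md", ["2023 John Doe"], "Apache-2.0") would create content/myfile.md.license and leave the TOML front matter in myfile.md as the first thing in the file.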

walpox avatar Nov 25 '23 09:11 walpox

I'm not sure whether we ever discussed that, but I am pretty sure we never meant to make it mandatory for the headers to start at line 1. At least that's how we handle it ourselves, and the tool itself - as a reference implementation of the spec - even supports adding these headers in subsequent lines, e.g. after bash shebangs or if a comment header already exists.

What we do in practice is to have the comment headers somewhere at the top, i.e. in the topmost comment section. This could even be line 20, as with curl. The rationale is that the REUSE information must be understandable (and also findable) for humans as well. So hiding it at the bottom of the file is not an option.

@carmenbianca, if you agree with my analysis, we could improve the wording in the spec.

mxmehl avatar Nov 27 '23 10:11 mxmehl

I am of the opinion that any specification benefits from having as little ambiguity as possible. Hopefully, newer versions of the REUSE spec will take this into consideration.

walpox avatar Nov 27 '23 13:11 walpox

This is a spec issue, yes. A few thoughts that aren't well-connected:

  • The specification is correct in that the header should be at the top of the file. Putting the copyright notices at the bottom of the file would be against the spec, even if the tool doesn't (presently) check for this.
  • 'The top of the file' is rather poorly specified. It should be something akin to 'as close to the beginning of the file as possible', which is vague and not very helpful, but less incorrect.
  • There exists no good/reliable way to check whether the header is 'as close to the beginning of the file as possible', ergo the tool doesn't really check for this.
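
To make the last point concrete, here is a rough sketch of what such a check could look like. It is an assumption-laden heuristic, not something the reuse tool does: the function names and the max_line threshold are hypothetical, and any fixed threshold is arbitrary, which is exactly why the check is unreliable.

```python
import re

# Matches the two SPDX tags that appear in the header example in this issue.
SPDX_TAG = re.compile(r"SPDX-(FileCopyrightText|License-Identifier):")


def first_spdx_line(text):
    """Return the 1-based line number of the first SPDX tag, or None."""
    for number, line in enumerate(text.splitlines(), start=1):
        if SPDX_TAG.search(line):
            return number
    return None


def header_near_top(text, max_line=30):
    """Heuristic only: is the first SPDX tag within the first max_line lines?

    max_line is an arbitrary, hypothetical cut-off; a long front matter or
    shebang block could legitimately push the header further down.
    """
    line = first_spdx_line(text)
    return line is not None and line <= max_line
```

For the Zola example in this issue, first_spdx_line would report line 7 (after the four-line front matter, a blank line, and the comment opener), which a human would still consider 'the top of the file'.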

But let's indeed fix this in https://github.com/fsfe/reuse-docs/pull/133

carmenbianca avatar Nov 27 '23 14:11 carmenbianca