The tool currently deals very poorly with erroneous SPDX expressions

Open carmenbianca opened this issue 5 years ago • 7 comments

Given a file erroneous-spdx.txt:

SPDX-Copyright: Carmen

SPDX-License-Identifier: MIT OR BSD AND

The output of reuse lint is:

reuse._util - ERROR - Could not parse 'MIT OR BSD AND'
reuse.project - ERROR - erroneous-spdx.txt holds an SPDX expression that cannot be parsed, skipping the file
NO LICENSE

The following files have no license(s):
  erroneous-spdx.txt

NO COPYRIGHT

The following files have no copyright:
  erroneous-spdx.txt

SUMMARY

Bad licenses: 0
Missing licenses: 0
Unused licenses: 0
Used licenses: Apache-2.0, CC-BY-SA-4.0, CC0-1.0, GPL-3.0-or-later
Read errors: 0
Files with copyright information: 47 / 48
Files with license information: 47 / 48

The ERROR statements are just logger output from within the program. The file is then completely skipped over, and its (completely valid) SPDX-Copyright tag is ignored.

Is this sufficient, or should the plumbing somehow change to account for this edge case?
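
To make the question concrete, here is a minimal Python sketch of plumbing that keeps the copyright tag even when the license expression fails to parse. It is not the tool's actual code: `FileInfo`, `extract_info`, and `looks_valid` are hypothetical names, and the validity check is deliberately naive (it only verifies that operands and AND/OR alternate, and ignores parentheses and WITH).

```python
import re
from dataclasses import dataclass, field

# Tag names as used in the example file above.
COPYRIGHT_RE = re.compile(r"SPDX-Copyright:\s*(.+)")
LICENSE_RE = re.compile(r"SPDX-License-Identifier:\s*(.+)")
OPERATORS = {"AND", "OR"}


@dataclass
class FileInfo:
    """Hypothetical container; bad expressions are kept instead of dropped."""
    copyrights: list = field(default_factory=list)
    expressions: list = field(default_factory=list)
    bad_expressions: list = field(default_factory=list)


def looks_valid(expression: str) -> bool:
    """Naive stand-in for a real SPDX parser: operands and AND/OR must alternate."""
    tokens = expression.split()
    for index, token in enumerate(tokens):
        expect_operator = index % 2 == 1
        if (token in OPERATORS) != expect_operator:
            return False
    # An odd token count means the expression ends on an operand.
    return len(tokens) % 2 == 1


def extract_info(text: str) -> FileInfo:
    """Collect copyright and license tags independently of each other."""
    info = FileInfo()
    for line in text.splitlines():
        if match := COPYRIGHT_RE.search(line):
            info.copyrights.append(match.group(1).strip())
        if match := LICENSE_RE.search(line):
            expression = match.group(1).strip()
            if looks_valid(expression):
                info.expressions.append(expression)
            else:
                info.bad_expressions.append(expression)
    return info
```

With the example file, `extract_info` would record `Carmen` as a copyright holder and put `MIT OR BSD AND` into `bad_expressions`, so the linter could report the broken expression without also reporting the file as having no copyright.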

carmenbianca, Apr 18 '19 15:04

I think it's OK to let this be an error. We should strongly discourage erroneous SPDX expressions so that reusing software is easy and unambiguous rather than guesswork.

See the stupid edge cases Thomas is dealing with in Linux for an example of how things can explode into a massive rework even with minor errors ;)

mxmehl, May 21 '19 12:05

Can you give me more info on what Thomas is currently dealing with? Maybe an article or e-mail I can read.

carmenbianca, May 21 '19 15:05

Sure. It's being discussed on the [email protected] mailing list. First post

mxmehl, May 24 '19 10:05

This bug is referred to by the documentation in #80. When this bug is fixed, the documentation should reflect that.

carmenbianca, Aug 29 '19 11:08

Somewhat related: reuse addheader lets you add any string you want as a license, e.g.

reuse addheader foobar/__init__.py --license GPLv33

Shouldn't the tool report this as an invalid license identifier and abort the operation, similar to how reuse init behaves?
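
A rough Python sketch of the kind of check meant here, under stated assumptions: `KNOWN_LICENSES` is a tiny hard-coded sample standing in for the full SPDX license list, and `check_license` is a hypothetical helper, not an existing function of the tool.

```python
import difflib

# Assumption: a small sample only. A real check would use the full SPDX
# license list and would also accept custom "LicenseRef-" identifiers.
KNOWN_LICENSES = {"MIT", "Apache-2.0", "GPL-3.0-only", "GPL-3.0-or-later", "CC0-1.0"}


def check_license(identifier: str) -> str:
    """Return the identifier if it is known, otherwise raise ValueError."""
    if identifier in KNOWN_LICENSES:
        return identifier
    # Offer near matches so the user can fix a typo quickly.
    suggestions = difflib.get_close_matches(identifier, sorted(KNOWN_LICENSES), n=3)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"'{identifier}' is not a known SPDX license identifier.{hint}")
```

Calling `check_license("GPLv33")` before writing the header would raise and let the command abort instead of producing a file with an unusable identifier.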

bittner, Dec 20 '20 19:12

As I felt the issue raised by @bittner is quite specific, I forked it off into a separate issue.

nicorikken, Jun 11 '21 20:06

In the example in #463, people will see the following error, even if the block ignore is implemented:

reuse._util - ERROR - Could not parse 'MIT" > file.txt'
reuse.project - ERROR - 'foobar.sh' holds an SPDX expression that cannot be parsed, skipping the file

The suggestion is to make this error more understandable and solvable for users:

  1. Collect these errors, and only display them near the summary block
  2. Combine and explain these errors in a better fashion, e.g. "The files contain text strings that confuse the REUSE tool. It cannot reliably determine the actual license and/or copyright. Please see $URL for an explanation and solution."
  3. Create the FAQ item ($URL) that explains the source of the problem and tells people to wrap the problematic lines in block ignores (#463).
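
Points 1 and 2 could look roughly like the following Python sketch. All names here (`ParseErrorCollector`, `add`, `render`) are hypothetical, the section heading is made up to match the style of the existing lint output, and `$URL` is left as the same placeholder as above because the FAQ item does not exist yet.

```python
from collections import defaultdict

FAQ_URL = "$URL"  # placeholder from the suggestion; no real URL is assumed


class ParseErrorCollector:
    """Gather unparseable expressions during the run, print them once at the end."""

    def __init__(self):
        self._errors = defaultdict(list)

    def add(self, path, expression):
        # Called where the tool currently logs "Could not parse ...".
        self._errors[path].append(expression)

    def render(self):
        """Return one combined block for display near the summary."""
        if not self._errors:
            return ""
        lines = [
            "UNPARSEABLE SPDX EXPRESSIONS",
            "",
            "The following files contain text strings that confuse the REUSE tool:",
        ]
        for path in sorted(self._errors):
            for expression in self._errors[path]:
                lines.append(f"  {path}: {expression!r}")
        lines += ["", f"Please see {FAQ_URL} for an explanation and solution."]
        return "\n".join(lines)
```

The lint command would call `add` wherever it currently logs the error, then print `render()` just before the SUMMARY block, so users see one explained section instead of scattered logger lines.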

mxmehl, Jan 24 '22 09:01