reuse-tool icon indicating copy to clipboard operation
reuse-tool copied to clipboard

feat: lint output per line

Open nicorikken opened this issue 10 months ago • 6 comments

Adds an '-line' or '-l' option to the 'lint' command. Prints a line for each error, starting with the file to which the error belongs.

This output can be a starting point for some parsers, in particular ones that implement something similar to Vim errorformat parsing.

Related to https://github.com/fsfe/reuse-tool/issues/925, needed for #578

Additional work needed:

  • [x] Needs tests
  • [ ] Error messages might have to be improved
  • [x] Not all errors are found, as some issues aren't in the FileReports. This requires additional investigation.

Signed-off-by: Nico Rikken [email protected]

nicorikken avatar Apr 10 '24 18:04 nicorikken

Current version works based on ProjectReport to ensure also license errors are included. A structural improvement might be to create structural reports for license errors, even as FileReports. And for file reports ideally the line number is kept to attribute errors to lines where possible.

nicorikken avatar Apr 20 '24 06:04 nicorikken

Two remaining suggestions for the future:

  • Include source code line numbers of the finding. Requires keeping track of this information when detecting
  • How best to deal with edge-case filenames that include colons and spaces?

Current output:

/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: bad license: invalid
LICENSES/invalid-license-text: bad license: invalid-license-text
LICENSES/Nokia-Qt-exception-1.1.txt: deprecated license
LICENSES/MIT: license without file extension
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: missing license: invalid
LICENSES/MIT: unused license
LICENSES/Nokia-Qt-exception-1.1.txt: unused license
LICENSES/invalid-license-text: unused license
/tmp/pytest-of-nico/pytest-44/fake_repository14/restricted.py: read error
/tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/no-license.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: without copyright

Note: I have now committed a change to remove colons from the error messages to aid in parsing.

nicorikken avatar Apr 22 '24 05:04 nicorikken

I'm now trying to validate with the errorformat library: errorformat "%f: %m" It seems it cannot succesfully parse the output of files with spaces and colons:

/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| bad license invalid
LICENSES/invalid-license-text|| bad license: invalid-license-text
LICENSES/Nokia-Qt-exception-1.1.txt|| deprecated license
LICENSES/MIT|| license without file extension
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| missing license invalid
LICENSES/MIT|| unused license
LICENSES/Nokia-Qt-exception-1.1.txt|| unused license
LICENSES/invalid-license-text|| unused license
/tmp/pytest-of-nico/pytest-44/fake_repository14/restricted.py|| read error
|| /tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py|| without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/no-license.py|| without license
|| /tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py|| without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| without copyright

So probably filenames will have to be escaped.

I tried rendering output for SARIF, but it seems that it requires a built-in format and that all compatible built-in formats require line numbers and column numbers.

nicorikken avatar Apr 22 '24 06:04 nicorikken

I originally had a test with a filename with colons, but this was causing errors on Windows. I removed it.

nicorikken avatar Apr 22 '24 19:04 nicorikken

The issue of using files with spaces seems to be an issue with the errorformat library: https://github.com/reviewdog/errorformat/issues/97 I don't think reuse-tool needs to escape filename output as there is already a clear separator in this output. This leaves the edge-case of filenames with colons, but this is also rare in a development environment that will have to cater to Windows which doesn't allow colons in filenames.

nicorikken avatar Apr 23 '24 03:04 nicorikken

I also spent a while looking at escaping options. Thank you @nicorikken for your analysis. I think there are only two characters that really concern us here:

  • Newlines in filenames (yes...)
  • Colons in filenames

I gave it a spin with Pylint to see how other tools handle it. Here's the result:

$ pylint src/reuse
************* Module reuse.hello
world
src/reuse/hello
world.py:1:0: C0103: Module name "hello
world" doesn't conform to snake_case naming style (invalid-name)
************* Module reuse.hello:world
src/reuse/hello:world.py:1:0: C0103: Module name "hello:world" doesn't conform to snake_case naming style (invalid-name)

The answer is: don't.

I think we can be lazy about this and not bother.

carmenbianca avatar Apr 27 '24 10:04 carmenbianca