reuse-tool
reuse-tool copied to clipboard
feat: lint output per line
Adds an '-line' or '-l' option to the 'lint' command. Prints a line for each error, starting with the file to which the error belongs.
This output can be a starting point for some parsers, in particular ones that implement something similar to Vim errorformat parsing.
Related to https://github.com/fsfe/reuse-tool/issues/925, needed for #578
Additional work needed:
- [x] Needs tests
- [ ] Error messages might have to be improved
- [x] Not all errors are found, as some issues aren't in the FileReports. This requires additional investigation.
Signed-off-by: Nico Rikken [email protected]
Current version works based on ProjectReport to ensure also license errors are included. A structural improvement might be to create structural reports for license errors, even as FileReports. And for file reports ideally the line number is kept to attribute errors to lines where possible.
Two remaining suggestions for the future:
- Include source code line numbers of the finding. Requires keeping track of this information when detecting
- How best to deal with edge-case filenames that include colons and spaces?
Current output:
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: bad license: invalid
LICENSES/invalid-license-text: bad license: invalid-license-text
LICENSES/Nokia-Qt-exception-1.1.txt: deprecated license
LICENSES/MIT: license without file extension
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: missing license: invalid
LICENSES/MIT: unused license
LICENSES/Nokia-Qt-exception-1.1.txt: unused license
LICENSES/invalid-license-text: unused license
/tmp/pytest-of-nico/pytest-44/fake_repository14/restricted.py: read error
/tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/no-license.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py: without copyright
Note: I have now committed a change to remove colons from the error messages to aid in parsing.
I'm now trying to validate with the errorformat library: errorformat "%f: %m"
It seems it cannot succesfully parse the output of files with spaces and colons:
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| bad license invalid
LICENSES/invalid-license-text|| bad license: invalid-license-text
LICENSES/Nokia-Qt-exception-1.1.txt|| deprecated license
LICENSES/MIT|| license without file extension
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| missing license invalid
LICENSES/MIT|| unused license
LICENSES/Nokia-Qt-exception-1.1.txt|| unused license
LICENSES/invalid-license-text|| unused license
/tmp/pytest-of-nico/pytest-44/fake_repository14/restricted.py|| read error
|| /tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py|| without license
/tmp/pytest-of-nico/pytest-44/fake_repository14/no-license.py|| without license
|| /tmp/pytest-of-nico/pytest-44/fake_repository14/file with spaces.py: without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/file:with:colons.py|| without copyright
/tmp/pytest-of-nico/pytest-44/fake_repository14/invalid-license.py|| without copyright
So probably filenames will have to be escaped.
I tried rendering output for SARIF, but it seems that it requires a built-in format and that all compatible built-in formats require line numbers and column numbers.
I originally had a test with a filename with colons, but this was causing errors on Windows. I removed it.
The issue of using files with spaces seems to be an issue with the errorformat
library: https://github.com/reviewdog/errorformat/issues/97 I don't think reuse-tool needs to escape filename output as there is already a clear separator in this output.
This leaves the edge-case of filenames with colons, but this is also rare in a development environment that will have to cater to Windows which doesn't allow colons in filenames.
I also spent a while looking at escaping options. Thank you @nicorikken for your analysis. I think there are only two characters that really concern us here:
- Newlines in filenames (yes...)
- Colons in filenames
I gave it a spin with Pylint to see how other tools handle it. Here's the result:
$ pylint src/reuse
************* Module reuse.hello
world
src/reuse/hello
world.py:1:0: C0103: Module name "hello
world" doesn't conform to snake_case naming style (invalid-name)
************* Module reuse.hello:world
src/reuse/hello:world.py:1:0: C0103: Module name "hello:world" doesn't conform to snake_case naming style (invalid-name)
The answer is: don't.
I think we can be lazy about this and not bother.