
Detect wrong license identifiers in addheader and suggest valid ones

Open hoijui opened this issue 3 years ago • 5 comments

When a BAD license is detected, suggest similarly named, valid licenses, if any. For example:

  • BAD license: CC0
  • Suggest: CC0-1.0

hoijui, Jun 09 '21 10:06

Later I noticed that suggestions like that already exist (I got one when trying a "BSD" license). I guess they could still be improved, though. See my example above: CC0 does not result in a suggestion.

hoijui, Jun 11 '21 16:06

I agree. Similarly, there is the distinction between GPL-3.0 and GPL-3.0-or-later: if you naively type GPL-3.0, you might be registering the wrong license. Considering this is all user input, some validation would be nice. I'm pretty sure there is a Python fuzzy text matching library that could find similar identifiers, so the user can be asked to confirm.
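To illustrate the fuzzy-matching idea, here is a minimal sketch using `difflib` from the Python standard library, which would avoid an extra dependency. The function name `suggest_identifiers`, the short `SAMPLE_IDENTIFIERS` list, and the cutoff value are assumptions made for this example. They are not taken from reuse-tool, which would check against the full SPDX license list.

```python
import difflib

# Hypothetical sample only; a real check would use the full SPDX license list.
SAMPLE_IDENTIFIERS = [
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "CC0-1.0",
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "MIT",
]


def suggest_identifiers(user_input, known=SAMPLE_IDENTIFIERS, limit=3, cutoff=0.6):
    """Return up to `limit` known identifiers that look similar to `user_input`.

    Similarity is difflib's SequenceMatcher ratio; candidates scoring below
    `cutoff` are dropped. The result is ordered from most to least similar.
    """
    return difflib.get_close_matches(user_input, known, n=limit, cutoff=cutoff)


# Usage: a scrambled identifier should bring up the intended one for confirmation.
print(suggest_identifiers("GLP-3.0-ro-ltaer"))
```

The cutoff is a trade-off: a lower value catches short or heavily abbreviated input but also produces more unrelated suggestions, so it would need tuning against real identifiers.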

nicorikken, Jun 11 '21 18:06

This was implemented in #152. Originally the suggestions were better, but adding a dependency on a fuzzy string matching library was not possible, so the functionality was simplified. Currently the suggestions work well only for typos such as GLP-3.0-ro-ltaer.

I agree that the suggestions should be improved. Is there any fuzzy string matching library we could use? Otherwise, the current solution could be modified to match prefixes, so that inputting GPL-3.0 would suggest GPL-3.0-only and GPL-3.0-or-later, for instance.
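The prefix idea could be sketched roughly as follows. The function name `suggest_by_prefix` and the sample list are hypothetical and only serve the illustration. This is not the code from #152, and a real implementation would match against the full SPDX license list.

```python
# Hypothetical sample only; a real check would use the full SPDX license list.
SAMPLE_IDENTIFIERS = [
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "CC0-1.0",
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "MIT",
]


def suggest_by_prefix(user_input, known=SAMPLE_IDENTIFIERS):
    """Return known identifiers that start with `user_input`, ignoring case.

    An exact match is not returned as a suggestion, and empty input yields
    nothing (otherwise every identifier would match).
    """
    needle = user_input.strip().casefold()
    if not needle:
        return []
    return sorted(
        identifier
        for identifier in known
        if identifier.casefold().startswith(needle)
        and identifier.casefold() != needle
    )


print(suggest_by_prefix("GPL-3.0"))  # ['GPL-3.0-only', 'GPL-3.0-or-later']
print(suggest_by_prefix("CC0"))      # ['CC0-1.0']
```

Prefix matching handles incomplete input such as CC0 or BSD, but not transposed characters, so it would complement the typo-oriented suggestions rather than replace them.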

siiptuo, Sep 21 '21 16:09

Oh, I almost forgot about the feature you once added, @siiptuo. Wouldn't that already be a great start for a check in addheader, especially after the merge of #416?

mxmehl, Sep 28 '21 13:09

I renamed this issue to track the effort of bringing the functionality we've had since #416 to addheader as well. Currently, addheader has no sanity check at all.

mxmehl, Jan 22 '22 13:01