reuse scores poorly on ClearlyDefined

Open carmenbianca opened this issue 5 years ago • 6 comments

ClearlyDefined, for the purposes of this issue, is a project that crawls GitHub/PyPI/NPM/wherever for projects/packages, and gives them a rating based on how well they document their licensing. They publish the ratings on their website.

Oddly, reuse scores extremely poorly. See:

Screenshot from 2020-02-10 11-04-52

I doubt that there is much that can be done from reuse's POV; the issue appears to lie with ClearlyDefined's crawler. I submitted a curation at https://github.com/clearlydefined/curated-data/pull/3477, but that is obviously not the best solution.

carmenbianca avatar Feb 10 '20 11:02 carmenbianca

Strange indeed, thanks for looking it up. Is this because REUSE puts license texts in LICENSES/ which is not understood by ClearlyDefined?

mxmehl avatar Feb 10 '20 11:02 mxmehl

I could identify the following things:

  • LICENSES/ directory is not recognised
  • setup.py is not correctly searched for the overall license
  • setup.py/PyPI is not correctly searched for the source location
  • SPDX-FileCopyrightText is not supported
  • A lot of false positives are found because of the nature of this project.

There are probably other problems as well, but these are the ones I could identify.
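To illustrate the first and fourth points, here is a rough sketch of what a crawler would need to look for in a REUSE-compliant repository. It is not ClearlyDefined's code; the function names `scan_header` and `list_license_texts` are made up for this example, and the tag matching is simplified compared to what the reuse tool itself does.

```python
import re
from pathlib import Path

# The two comment tags a scanner has to recognise. Matching is simplified:
# trailing comment closers such as "*/" are not stripped here.
_TAG_RE = re.compile(
    r"SPDX-(License-Identifier|FileCopyrightText):[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def scan_header(text):
    """Collect SPDX license expressions and copyright notices from file text."""
    found = {"licenses": [], "copyrights": []}
    for tag, value in _TAG_RE.findall(text):
        key = "licenses" if tag == "License-Identifier" else "copyrights"
        found[key].append(value)
    return found


def list_license_texts(root):
    """Return the identifiers of license texts stored in <root>/LICENSES/."""
    licenses_dir = Path(root) / "LICENSES"
    if not licenses_dir.is_dir():
        return []
    # A file named GPL-3.0-or-later.txt holds the text of GPL-3.0-or-later.
    return sorted(p.stem for p in licenses_dir.iterdir() if p.is_file())


header = (
    "# SPDX-FileCopyrightText: 2020 Jane Doe\n"
    "# SPDX-License-Identifier: GPL-3.0-or-later\n"
)
print(scan_header(header))
```

A crawler that only looks for a top-level LICENSE or COPYING file, or only for classic "Copyright (c)" lines, would miss both sources of information.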

carmenbianca avatar Feb 10 '20 11:02 carmenbianca

ClearlyDefined also uses FOSSology. It is probably best/easiest to ask them to enable FOSSology's Ojo as well, or to add the REUSE tool as one of the scanners they use.

silverhook avatar Feb 18 '20 13:02 silverhook

I opened an issue for the problems I could identify:

https://github.com/clearlydefined/crawler/issues/372

There also exists a FOSSology issue here:

https://github.com/fossology/fossology/issues/1592

and here for ScanCode:

https://github.com/nexB/scancode-toolkit/issues/1816

carmenbianca avatar Feb 21 '20 15:02 carmenbianca

There is ongoing work to make ClearlyDefined understand REUSE, so the situation might improve soon!

clearlydefined/service#883

mxmehl avatar Dec 14 '21 09:12 mxmehl

Yes. Although I've been pretty conservative with applying data from REUSE to the overall results, you can already see clear improvements on my dev instance:

image

There is still some potential for improvement: for example, I need to debug why the date couldn't be found here, and the license texts from the LICENSES folder still need to be picked up properly (hence the 0/15 here). That should be doable, though.

SebastianWolf-SAP avatar Dec 14 '21 14:12 SebastianWolf-SAP