licensed
Collect additional legal files
Currently licensed includes files matching:
LEGAL_FILES = /\A(AUTHORS|COPYING|NOTICE|LEGAL)(?:\..*)?\z/i
I think COPYING is superfluous, see #84, but there are some other files (also with any extension) that often include legal notices that would be useful to automatically collect, eg:
LICENSES, NOTICES (plural), THIRD-PARTY-LICENSE (variations including 3rd, no delimiter, underscore delimiter, notice, plural)
I can probably suggest a semi-sane regex and tests for this, leaving this issue until then, or if anyone else wishes to comment or implement.
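As a starting point for discussion, here is a rough sketch of what such a regex might look like. It is not the pattern licensed ships: the constant name `EXTENDED_LEGAL_FILES` and the helper `legal_files` are made up for illustration, and the sketch assumes that the singular LICENSE file keeps being handled elsewhere, so it is deliberately left out. COPYING is kept only because it is in the current pattern.

```ruby
# Hypothetical extension of LEGAL_FILES, written in extended (x) mode
# so each alternative sits on its own line. Assumption: this is a sketch,
# not the regex used by licensed.
EXTENDED_LEGAL_FILES = /
  \A
  (?:
    AUTHORS
    | COPYING
    | LEGAL
    | NOTICES?
    | LICENSES
    | (?:THIRD|3RD)[-_]?PARTY[-_]?(?:LICENSE|NOTICE)S?
  )
  (?:\..*)?
  \z
/xi

# Hypothetical helper: keep only the names that look like legal files.
def legal_files(names)
  names.select { |name| EXTENDED_LEGAL_FILES.match?(name) }
end

legal_files(%w[README.md NOTICES.md THIRD-PARTY-LICENSE 3rd_party_notices.txt])
```

The last alternative covers the third-party variations listed above: `THIRD` or `3RD`, an optional `-` or `_` delimiter, `LICENSE` or `NOTICE`, and an optional plural. As in the current pattern, any extension is accepted.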
@mlinksva if you want to give that regex a shot that would be great, otherwise I'll try to get to this soon.
LICENSES, THIRD-PARTY-LICENSE
It's not clear... should these be found by licensee?
Might appear idiosyncratic, but no and yes...confirmed:
> Licensee::ProjectFiles::LicenseFile::name_score("LICENSES")
=> 0.0
> Licensee::ProjectFiles::LicenseFile::name_score("THIRD-PARTY-LICENSE")
=> 0.6
It probably makes sense to avoid the latter match in licensee, now that you mention it.
Some projects use different files, like Hibernate, which uses an lgpl.txt. This causes a lot of issues when trying to detect licensing problems in dependencies. Could Dependabot warn the most popular repos when there's no LICENSE file?