licensir

Support guessing from README

Open cybrox opened this issue 5 years ago • 5 comments

As far as I can see, there is no implementation or work-in-progress for guessing license information from license text pasted into README or README.md. Is this correct?

Especially in Elixir, I've seen a lot of projects do this and I think it should be supported. I'd be happy to write a PR for that functionality if desired.

cybrox avatar Jun 24 '19 14:06 cybrox

Hi @cybrox! No, we don't have that yet. As it is now, this tool only checks mix.exs and LICENSE files. Feel free to open a PR!

unnawut avatar Jun 25 '19 07:06 unnawut

I thought this would be quite tricky but it seems most maintainers that put the license information into the README just copy the whole license text in as well.

A pragmatic solution would be to solve this like #12, but I don't quite like it. Do you think there should be a kind of "scoring" system, where the LICENSE* files count for more than anything that might happen to be in a README, or should all sources count the same?

cybrox avatar Jun 27 '19 06:06 cybrox

I haven't come across a library that dumps the license text into the README though. Usually I see a one-liner:

[repo name] is released under [license name].

Right now the library makes no assumptions about which source is more correct. Instead, it informs the user of the discrepancies, e.g.

Unsure (found: Apache 2.0, Apache 2)

This is because I think the risk of deducing the wrong license is too high and it's much better/safer to fix the discrepancy at the source.
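
As a rough illustration of the reporting behaviour described above, here is a minimal sketch in Python (not licensir's actual Elixir code). The function name `summarize` and the `"Undefined"` fallback for "nothing found" are assumptions made for the example; only the `Unsure (found: ...)` format comes from this thread.

```python
def summarize(found):
    """Combine license names found in different sources into one report.

    `found` holds one license name per source that yielded a result
    (for example mix.exs and a LICENSE file). No source is trusted
    more than another; disagreements are reported, not resolved.
    """
    # Drop duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(found))
    if not unique:
        # Assumed placeholder for "no license information found".
        return "Undefined"
    if len(unique) == 1:
        return unique[0]
    return "Unsure (found: " + ", ".join(unique) + ")"
```

With this sketch, `summarize(["Apache 2.0", "Apache 2"])` reports both names instead of picking one, which leaves the fix to whoever maintains the source.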

Happy to hear more ideas, but my opinion would be to detect the one-liners and map them with something like this: https://github.com/unnawut/licensir/blob/master/lib/licensir/naming_variants.ex
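
To make the one-liner idea concrete, here is a hedged sketch in Python rather than Elixir. The variant table, the regular expression, and the function name `license_from_one_liner` are all hypothetical; the real contents of `naming_variants.ex` may differ.

```python
import re

# Hypothetical variant table: lowercase spelling -> canonical name.
NAMING_VARIANTS = {
    "mit": "MIT",
    "mit license": "MIT",
    "apache 2": "Apache 2.0",
    "apache 2.0": "Apache 2.0",
    "apache license, version 2.0": "Apache 2.0",
}

# Matches lines such as "[repo name] is released under [license name]."
ONE_LINER = re.compile(
    r"(?:released|licensed|distributed)\s+under\s+(?:the\s+)?(.+?)\s*\.?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def license_from_one_liner(readme_text):
    """Return the canonical license name from a README one-liner, or None."""
    match = ONE_LINER.search(readme_text)
    if match is None:
        return None
    name = match.group(1).strip().lower()
    # Unknown spellings yield None instead of a guess.
    return NAMING_VARIANTS.get(name)
```

Returning `None` for unknown spellings keeps the sketch in line with the "don't deduce" stance: a name that is not in the table is never guessed at.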

unnawut avatar Jun 27 '19 07:06 unnawut

There are quite a few that do it, like ex_aws, decimal, mix-test.watch, ..., so I think matching for the whole license text should be supported as well.
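
One possible way to match a full license text pasted into a README is to look for a distinctive phrase from each license after normalizing whitespace. The sketch below is Python, not licensir code; the `FINGERPRINTS` table and function names are assumptions, and a real implementation would likely need more licenses and a more robust comparison.

```python
import re

# Distinctive phrases from well-known license texts (small, assumed table).
FINGERPRINTS = {
    "MIT": "permission is hereby granted, free of charge, "
           "to any person obtaining a copy",
    "Apache 2.0": "licensed under the apache license, version 2.0",
}


def normalize(text):
    """Lowercase and collapse whitespace so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def licenses_from_full_text(readme_text):
    """Return every license whose fingerprint phrase occurs in the README."""
    haystack = normalize(readme_text)
    return [name for name, phrase in FINGERPRINTS.items() if phrase in haystack]
```

The function returns a list rather than a single name, so a README that happens to contain more than one fingerprint can still be reported as a discrepancy.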

I would definitely like to support one-liners and things like the example below. I think using a mapping like the naming variants would work very well.

### License
MIT
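
A README section like the one above could be picked up by finding a "License" heading and reading the first non-empty line below it. This is a minimal Python sketch under that assumption; the names `HEADING` and `text_under_license_heading` are hypothetical, and the returned text would still need to go through a naming-variants mapping.

```python
import re

# A Markdown heading (1 to 6 '#') whose title is "License" or "Licence".
HEADING = re.compile(r"^#{1,6}\s*licen[sc]e\s*$", re.IGNORECASE)


def text_under_license_heading(readme_text):
    """Return the first non-empty line under a License heading, or None."""
    lines = readme_text.splitlines()
    for index, line in enumerate(lines):
        if not HEADING.match(line.strip()):
            continue
        for candidate in lines[index + 1:]:
            stripped = candidate.strip()
            if not stripped:
                continue
            # Another heading means the License section was empty.
            if stripped.startswith("#"):
                return None
            return stripped
        return None
    return None
```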

cybrox avatar Jun 27 '19 07:06 cybrox

Oh wow. Thanks for the examples. I've used those before and didn't realise they put the full text there (granted, MIT is short enough to do that). Trying to match them looks good to me then.

If you also agree that we should report discrepancies rather than deduce the license ourselves, feel free to push this ahead!

unnawut avatar Jun 27 '19 07:06 unnawut