scancode.io icon indicating copy to clipboard operation
scancode.io copied to clipboard

An app to detect licenses from the provided input license text

Open lf32 opened this issue 2 years ago • 55 comments

The aim of this PR is to create a working app inside this Django project to detect licenses from the provided input text and summarize the results for the end user.

lf32 avatar Jun 16 '22 05:06 lf32

@lf32 That's a good start, see my comments for minor improvements.

In the future you need to provide more context than simply requesting a review. There's a tons of commented code, which part do you need to be reviewed? Include what you have accomplished and what is left to do, also include the issues you are facing.

thanks, i pinged you to let you know about the PR

  • the scancode.api.get_licenses accepts files as input but not text, what do you think of it?
  • do you think the input text should be stored as a file and then pass it as the argument?

lf32 avatar Jun 16 '22 07:06 lf32

You can use a tempfile for now or adapt the code to use match with the query_string argument?

Still checking this! will let you know ASAP

tdruez avatar Jun 16 '22 08:06 tdruez

Screenshot 2022-06-16 at 12 15 58

You are putting side by side a form submit button "Scan License" and a navigation link "New Project".

I think the "New Project" as no need to be present in that view. Also, the action button "Scan License" would make more sense to be located after the form, not on top of it. The textarea is simply too big, reduce the height.

tdruez avatar Jun 16 '22 08:06 tdruez

@tdruez please suggest changes and also any resource for recommending license?

res_license

lf32 avatar Jun 19 '22 05:06 lf32

Screenshot 2022-06-16 at 12 15 58

You are putting side by side a form submit button "Scan License" and a navigation link "New Project".

I think the "New Project" as no need to be present in that view. Also, the action button "Scan License" would make more sense to be located after the form, not on top of it. The textarea is simply too big, reduce the height.

res_license

How is this? rows are set to 25, isn't it ok?

lf32 avatar Jun 19 '22 05:06 lf32

please suggest changes

This looks awesome but where is the code? No code, no review possible.

any resource for recommending license?

I don't know what you mean by that.

tdruez avatar Jun 20 '22 05:06 tdruez

This looks awesome but where is the code? No code, no review possible.

I have just pushed changes over here

lf32 avatar Jun 20 '22 06:06 lf32

@lf32 please read the following for future commit messages (no need to update the existing ones): https://aboutcode.readthedocs.io/en/latest/contributing/writing_good_commit_messages.html

For example: "changed card title from name to short_name" should be "Change card title from name to short_name #450"

You can look at the commit history of the repo and try to follow the conventions in place.

tdruez avatar Jun 20 '22 07:06 tdruez

should the user have a checkbox to decide if they have to detect unknown licenses or should we just set it to true by default

for the above license the percentage is 99.21 when its true, but 97.91 when set to False

lf32 avatar Jun 21 '22 16:06 lf32

should the user have a checkbox to decide if they have to detect unknown licenses or should we just set it to true by default

It's not yet the time to make those kind of advance refinements. We want to display everything for now.

tdruez avatar Jun 21 '22 16:06 tdruez

@lf32 your recent changes in get_licenses breaks the detection of simple content such as "Apache 2.0". Is this behavior expected?

tdruez avatar Jun 22 '22 08:06 tdruez

@lf32 your recent changes in get_licenses breaks the detection of simple content such as "Apache 2.0". Is this behavior expected?

Yes I guess, when tempfile is not flushed with temp_file.flush() I think the file doesn't get written to disk and get_licenses cant read it from the beginning

lf32 avatar Jun 22 '22 10:06 lf32

Yes I guess, when tempfile is not flushed with temp_file.flush() I think the file doesn't get written to disk and get_licenses cant read it from the beginning

Keep the .flush() and add a comment saying that the flush is require to ensure that the content is written to the disk before it's read by get_licenses

tdruez avatar Jun 22 '22 11:06 tdruez

Also, you haven't answer to my questions from my previous comment:

  • Also, what's the deal with all the force-pushed in your commit history? This is a source of issues when fetching your changes on my side. Try to avoid those rewriting in the commits history.

  • Remove the models.py file and migrations/ directory. Not needed for now.

  • It would be nice to highlight the "matched text" on the left side when I open the match details on the right side. Does that feature make sense to you?

tdruez avatar Jun 22 '22 11:06 tdruez

@lf32 Good progress!

Some feedback:

Thanks, for the feedback :)

* Commit messages look better but there's no need to uppercase every single words.
  A sentence starts with an upper case then each words is lower case.
  Please have a look at the commit history of the repo and try to be consistent with the existing style.

* Also, what's the deal with all the `force-pushed` in your commit history?
  This is a source of issues when fetching your changes on my side. Try to avoid those rewriting in the commits history.

I have force pushed in the past because I was trying to make commits look better and, in a few other cases, I had to amend simple changes to the last commit which was pushed to github.

* Remove the `models.py` file and `migrations/` directory. Not needed for now.

lost the commit during a hard reset

* It would be nice to highlight the "matched text" on the left side when I open the match details on the right side. Does that feature make sense to you?

Is it like how demo.grammarly.com does? It sounds great and I'm working on it

lf32 avatar Jun 22 '22 13:06 lf32

Is it like how demo.grammarly.com does? It sounds great and I'm working on it

That UI looks pretty good, I like that we could use different colors based on the criticality of the various matches.

tdruez avatar Jun 22 '22 14:06 tdruez

Is it like how demo.grammarly.com does? It sounds great and I'm working on it

That UI looks pretty good, I like that we could use different colors based on the criticality of the various matches.

based on score?

lf32 avatar Jun 22 '22 17:06 lf32

based on score?

I was thinking more about License Policies and Compliance Alerts

You can start using the score for the initial implementation though, since the value is already available in your template. We can improve using the policies later.

tdruez avatar Jun 23 '22 05:06 tdruez

@tdruez what do you think about github's changelog?

lf32 avatar Jun 29 '22 06:06 lf32

@tdruez what do you think about github's changelog?

@lf32 In which context? I'm not familiar with it.

tdruez avatar Jun 29 '22 06:06 tdruez

Displaying detected license in the way changelog looks, instead of highlighting the text.

lf32 avatar Jun 29 '22 06:06 lf32

@lf32 please explain why you think this UI would be better.

tdruez avatar Jun 29 '22 06:06 tdruez

In the previous case user has to click on the card's drop down, here user has to just scroll and the license name can be seen on every matched line.

Besides that, there is an issue in detecting matched text from the html because the matched result is actually altered from the original one.

lf32 avatar Jun 29 '22 07:06 lf32

res_license

lf32 avatar Jun 29 '22 09:06 lf32

In the previous case user has to click on the card's drop down

Please be explicit with what you mean by "previous case", it's not clear.

Besides that, there is an issue in detecting matched text from the html because the matched result is actually altered from the original one

This is an issue and it can probably be solved. Changing the end results because your are facing a difficulty is generally not a good enough reason.

Actually, it would be nice to have both options in separate tabs:

  • The list and matched text like presented in your screenshot at https://github.com/nexB/scancode.io/pull/450#issuecomment-1169724197 as "Details"
  • The full input text with all the matchs highlighted to view all the matches in the context of the whole text.

tdruez avatar Jun 29 '22 14:06 tdruez

Please be explicit with what you mean by "previous case", it's not clear.

sure!

This is an issue and it can probably be solved. Changing the end results because your are facing a difficulty is generally not a good enough reason.

:sweat_smile: Would you please help me with this? I have some issues.

The list and matched text like presented in your screenshot at ... as "Details"

can we discuss this, I have few questions (on gitter)

lf32 avatar Jun 30 '22 10:06 lf32

😅 Would you please help me with this? I have some issues.

Sure, can you try to explain the problem, and present some example?

can we discuss this, I have few questions (on gitter)

Why not here instead of gitter?

tdruez avatar Jul 01 '22 09:07 tdruez

@lf32 could you provide a quick status of your progress? What is now complete, what you are currently working on, and what is left todo.

Cheers!

tdruez avatar Jul 04 '22 07:07 tdruez

Progress Update

  • Developed Views
  • Tweaking around Templates
  • Have to discuss on what's next to do.

lf32 avatar Jul 04 '22 07:07 lf32

@tdruez I am thinking of using ace editor instead of manually designing the input text div (left column).

Which one do you think is having a better view?

1:1 2:1
1 2

lf32 avatar Jul 04 '22 12:07 lf32