Detect "All rights reserved." on subsequent line

Open vw-anton opened this issue 1 year ago • 4 comments

Description

We have found a scenario where the copyright is stated on two consecutive lines:

https://github.com/d3/d3-shape/blob/v1.3.7/LICENSE

Copyright 2010-2015 Mike Bostock
All rights reserved.

However, ScanCode only reports the first line:

{
  "path": "d3-shape-1.3.7/LICENSE",
  "type": "file",
  "name": "LICENSE",
  "status": "application-package",
 ...
  "copyrights": [
    {
      "end_line": 1,
      "copyright": "Copyright 2010-2015 Mike Bostock",
      "start_line": 1
    }
  ],
 ...

How To Reproduce

Run ScanCode on the LICENSE file linked above and check the reported copyrights.

System configuration

scancode.io with pkg:pypi/[email protected]

vw-anton avatar Apr 17 '24 08:04 vw-anton

Thanks for the report. In both cases (the statement on one line or split across two lines), ScanCode reports the same copyright statement and reports it as spanning a single line.

The "All rights reserved" text is used in the detection but never returned. See https://github.com/nexB/scancode-toolkit/blob/04e24e0c7edccaa27ad2cf495e15abb849a922c9/src/cluecode/copyrights.py#L87 as we have some code that started dealing with this, but it is not fully functional or finished.
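To make the two behaviors concrete, here is a minimal, self-contained Python sketch. It is not ScanCode's implementation (the real detection is far more involved); the `detect` function, its `include_allrights` flag, and the regex are hypothetical names used only for illustration. It shows how a trailing "All rights reserved." could be either dropped or kept, and how keeping it changes `end_line` when the marker sits on the following line.

```python
import re

# Toy pattern for illustration only; this is an assumption, not ScanCode's grammar.
ALLRIGHTS_RE = re.compile(r"\s*all rights reserved\.?$", re.IGNORECASE)


def detect(text, include_allrights=False):
    """Return toy copyright detections with 1-based start_line and end_line."""
    lines = text.splitlines()
    results = []
    for index, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.lower().startswith("copyright"):
            continue
        start_line = end_line = index + 1
        statement = stripped
        suffix = None

        # Case 1: the marker trails the statement on the same line.
        same_line = ALLRIGHTS_RE.search(stripped)
        if same_line:
            statement = stripped[: same_line.start()].rstrip()
            suffix = same_line.group(0).strip()
        # Case 2: the marker stands alone on the next line.
        elif index + 1 < len(lines):
            next_line = lines[index + 1].strip()
            if ALLRIGHTS_RE.fullmatch(next_line):
                suffix = next_line
                if include_allrights:
                    end_line = index + 2

        # The marker is always recognized, but only returned on request.
        if include_allrights and suffix:
            statement = f"{statement} {suffix}"
        results.append(
            {"copyright": statement, "start_line": start_line, "end_line": end_line}
        )
    return results


two_lines = "Copyright 2010-2015 Mike Bostock\nAll rights reserved.\n"
default_result = detect(two_lines)
full_result = detect(two_lines, include_allrights=True)
```

With the flag off, the sketch reproduces the output from the report (a single-line statement without the marker). With it on, the statement keeps the marker and `end_line` becomes 2.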

pombredanne avatar Apr 17 '24 08:04 pombredanne

Hi Philippe,

Our policy is to preserve the copyright as it was identified. We therefore activated the feature you are referring to, using include_copyright_allrights=true. When you say that the feature is not fully functional or finished, what exactly do you mean by this? Our tests have been promising so far, and we activated the feature because it produces much better results than the default configuration.

Could you elaborate in detail on the implementation status?

What is required to move forward?

Kind regards, Karsten

karsten-klein avatar Sep 05 '24 08:09 karsten-klein

@karsten-klein Hi! ... Could you contribute your tests?

pombredanne avatar Sep 05 '24 10:09 pombredanne

We are not heavy Python users. If we could agree on the test data specification (organization of input and expectation), we could share the data. If you provide the test setup, we should be able to do it. Or do you already have a layout in mind that we can use?
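As a starting point for that discussion, here is one possible layout, sketched in Python. Everything in it is an assumption for illustration and not the project's actual test format: each input file sits next to a hypothetical `<name>.expected.json` file holding the expected copyright strings, and `run_cases` compares them against whatever detector function is passed in. The `stand_in_detector` is a placeholder, not ScanCode.

```python
import json
import tempfile
from pathlib import Path

SUFFIX = ".expected.json"  # hypothetical naming convention


def load_cases(data_dir):
    """Yield (input_path, expected_copyrights) for each expectation file."""
    for expected_path in sorted(Path(data_dir).glob("*" + SUFFIX)):
        # "LICENSE.expected.json" pairs with the input file "LICENSE".
        input_path = expected_path.with_name(expected_path.name[: -len(SUFFIX)])
        expected = json.loads(expected_path.read_text(encoding="utf-8"))
        yield input_path, expected["copyrights"]


def run_cases(data_dir, detector):
    """Return (file name, expected, actual) for every mismatching case."""
    failures = []
    for input_path, expected in load_cases(data_dir):
        actual = detector(input_path.read_text(encoding="utf-8"))
        if actual != expected:
            failures.append((input_path.name, expected, actual))
    return failures


def stand_in_detector(text):
    """Placeholder detector: returns lines that start with 'Copyright'."""
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip().lower().startswith("copyright")
    ]


# Usage: one case where the expectation keeps "All rights reserved."
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "LICENSE").write_text(
        "Copyright 2010-2015 Mike Bostock\nAll rights reserved.\n", encoding="utf-8"
    )
    Path(tmp, "LICENSE" + SUFFIX).write_text(
        json.dumps(
            {"copyrights": ["Copyright 2010-2015 Mike Bostock All rights reserved."]}
        ),
        encoding="utf-8",
    )
    failures = run_cases(tmp, stand_in_detector)
```

Keeping the expectation as plain data next to each input means new cases can be contributed without writing any Python; only the small runner needs to live in the test suite.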

karsten-klein avatar Sep 05 '24 11:09 karsten-klein