Allow to curate copyrights directly (not via authors)

Open sschuberth opened this issue 4 years ago • 5 comments

adding the Copyright holder statements to the org.ossreviewtoolkit.model.PackageCurationData entity

FYI, I just pushed my local curate-copyrights branch which trivially starts doing that.

@rbieniek do you want to take over that branch of mine?

Originally posted by @sschuberth in https://github.com/oss-review-toolkit/ort/issues/4463#issuecomment-931138189

sschuberth avatar Sep 30 '21 09:09 sschuberth

Being able to curate copyright holders would be really useful, but are you thinking of implementing it in curations.yml or in package configurations?

I'm asking because I have the following case: for Maven:org.apache.activemq:activemq-broker:5.16.2, the root LICENSE (e.g. https://github.com/apache/activemq/blob/ff1af27106c74ad930c5bd12e8c0159e522efb70/LICENSE) includes licenses that apply to other activemq packages but not to activemq-broker.

I wanted to fix this via a package configuration instead of curations.yml, but then I found that while I can remove non-applicable detected licenses (BSD-3-Clause, CC-BY-2.5, CC-BY-SA-2.5, LicenseRef-scancode-cc-devnations-2.0, LicenseRef-scancode-ekioh, MIT, NOASSERTION), I can't remove non-applicable copyright statements.

I see several ways to handle this case:

A) If you remove a detected license by concluding the license (via package configuration or curations.yml), then the copyright holders associated with licenses that are not part of the concluded license are removed as well. This way, ORT users do not have to make a lot of copyright curations but can focus on licenses.

B) A) does not cover the case where copyrights are not associated with a license, so we also need a mechanism to add and remove copyrights and/or to associate copyrights with a license.
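
To make option A) more concrete, here is a minimal Python sketch (not ORT code). It assumes a hypothetical `AssociatedCopyright` record that pairs a statement with at most one license ID, and it treats the concluded license as a flat list of IDs rather than a full SPDX expression. Statements without a license are returned separately, which is exactly the gap described in B).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssociatedCopyright:
    """Hypothetical model: a copyright statement and the license it belongs to, if any."""
    statement: str
    license: Optional[str] = None


def split_by_concluded(findings, concluded_licenses):
    """Return (kept, unassociated) statements for the given concluded license IDs."""
    concluded = set(concluded_licenses)
    kept, unassociated = [], []
    for finding in findings:
        if finding.license is None:
            # Option A) has no rule for these; they need a separate mechanism (option B).
            unassociated.append(finding.statement)
        elif finding.license in concluded:
            kept.append(finding.statement)
        # Statements tied to a license outside the concluded set are dropped.
    return kept, unassociated


findings = [
    AssociatedCopyright("Copyright (C) 2012 Example, Inc", "Apache-2.0"),
    AssociatedCopyright("Copyright (C) 2022 John Doe", "MIT"),
    AssociatedCopyright("Copyright (C) 2019 Jane Doe"),
]
kept, unassociated = split_by_concluded(findings, ["Apache-2.0"])
print(kept)
print(unassociated)
```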

Adding this topic to the ORT developer meeting agenda to reach consensus on the best way to implement curating copyright holders.

tsteenbe avatar Oct 07 '21 09:10 tsteenbe

@rbieniek do you want to take over that branch of mine?

Ping @porsche-rbieniek and @porsche-rishisaxena as a reminder to move this forward.

sschuberth avatar Apr 21 '22 08:04 sschuberth

Porsche solution submitted as https://github.com/oss-review-toolkit/ort/pull/5315

porsche-rbieniek avatar May 04 '22 10:05 porsche-rbieniek

Porsche solution submitted as #5315

Please associate issues with PRs by using one of the respective keywords in one of the commits in the PR instead of manually adding comments.

sschuberth avatar May 06 '22 06:05 sschuberth

Copy-paste from the July 14th, 2022 ORT developer meeting minutes, in which we discussed how to implement curating declared copyrights:

A. Use curations to curate declared copyrights

The declared copyrights come from the package metadata collected by the ORT analyzer, which can be fixed up using curations. Note that a lot of package managers do not have copyright fields, only authors, contributors, or developers, which in some cases are "mis-used" to convey copyright information. We could implement a parseCopyrights() alongside parseAuthors() to find copyright statements, e.g. entries starting with "copyright" or "(c)".
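
As a rough illustration of the parseCopyrights() idea, the Python sketch below filters author-like metadata entries for those starting with "copyright" or "(c)". The function name `parse_copyrights`, the regular expression, and the sample entries are assumptions for illustration only and are not part of ORT.

```python
import re

# Matches entries that begin with the word "copyright" or with "(c)",
# ignoring case and leading whitespace.
_COPYRIGHT_PREFIX = re.compile(r"^\s*(copyright\b|\(c\))", re.IGNORECASE)


def parse_copyrights(entries):
    """Return the entries that look like copyright statements, stripped of whitespace."""
    return [entry.strip() for entry in entries if _COPYRIGHT_PREFIX.match(entry)]


authors = [
    "Jane Doe <jane@example.com>",
    "Copyright (C) 2022 John Doe",
    "  (c) 2012 Example, Inc",
    "Copyrighted Works Ltd",
]
print(parse_copyrights(authors))
```

The word boundary after "copyright" keeps a holder name such as "Copyrighted Works Ltd" from being mistaken for a statement; whether that is the desired behavior would need to be decided.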

Below are several ideas for what curating declared copyrights in curations.yml could look like.

Remove a declared copyright

declared_copyright_mapping:
   "MIT": ""

Add a declared copyright

declared_copyright_mapping:
   "": "Copyright (C) 2022 John Doe"

Overwrite all declared copyrights with a single one:

declared_copyright_mapping:
   "*": ""
   "": "Copyright (C) 2022 John Doe"

Overwrite all declared copyrights with more than one:

declared_copyright_mapping:
   "*": ""
   "": "Copyright (C) 2022 John Doe"
   "": "Copyright (C) 2019 Jane Doe"
   "": "Copyright (C) 2012 Example, Inc"

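One possible reading of the declared_copyright_mapping examples above, sketched in Python: a non-empty key names an existing entry to replace or remove, "*" stands for all entries, and an empty key adds a statement. These semantics are an assumption drawn from the examples. The sketch takes the mapping as an ordered list of (key, value) pairs because YAML mapping keys are meant to be unique, so the repeated "" keys would likely be rejected or collapsed by a YAML parser.

```python
def apply_declared_copyright_mapping(declared, mapping):
    """Apply (key, value) pairs in order to a list of declared copyright entries."""
    result = list(declared)
    for key, value in mapping:
        if key == "*":
            # Wildcard: replace everything, or clear the list if the value is empty.
            result = [value] if value else []
        elif key == "":
            # Empty key: add a statement unless it is already present.
            if value and value not in result:
                result.append(value)
        elif value:
            # Replace a specific entry.
            result = [value if entry == key else entry for entry in result]
        else:
            # Empty value: remove a specific entry.
            result = [entry for entry in result if entry != key]
    return result


overwritten = apply_declared_copyright_mapping(
    ["Copyright (C) 2001 Old Holder"],
    [
        ("*", ""),
        ("", "Copyright (C) 2022 John Doe"),
        ("", "Copyright (C) 2019 Jane Doe"),
    ],
)
print(overwritten)
```
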
Associate declared licenses with declared copyrights:

declared_license_copyrights_mapping:
   "MIT": "Copyright (C) 2022 John Doe"
   "MIT": "Copyright (C) 2019 Jane Doe"
   "Apache-2.0": "Copyright (C) 2012 Example, Inc"

B. Use package configurations to curate detected copyrights

Introduce copyright_finding_curations in package configurations to curate detected copyrights:

id: "NPM::ansi-styles:4.2.1"
copyright_finding_curations:
  - path: "README.md"
    start_lines: "3"
    line_count: 11
    detected_copyright: ""
    reason: "INCORRECT"
    comment: "Copyright only written on project website."
    concluded_copyright: "Copyright (C) 2022 John Doe"
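
How such an entry might be applied to scanner findings can be sketched in Python as follows. All class and function names are hypothetical, and the matching rules are assumptions: the path is compared exactly, the finding must start on one of the given start lines, and an empty detected_copyright matches any statement. line_count and comment are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyrightFinding:
    """Hypothetical detected copyright statement with its location."""
    path: str
    start_line: int
    statement: str


@dataclass(frozen=True)
class CopyrightFindingCuration:
    """Hypothetical in-memory form of one copyright_finding_curations entry."""
    path: str
    start_lines: tuple
    detected_copyright: str
    concluded_copyright: str
    reason: str

    def matches(self, finding):
        return (
            finding.path == self.path
            and finding.start_line in self.start_lines
            # Assumption: an empty detected_copyright acts as a wildcard.
            and self.detected_copyright in ("", finding.statement)
        )


def apply_curations(findings, curations):
    """Return the curated statements; an empty concluded_copyright drops the finding."""
    curated = []
    for finding in findings:
        match = next((c for c in curations if c.matches(finding)), None)
        if match is None:
            curated.append(finding.statement)
        elif match.concluded_copyright:
            curated.append(match.concluded_copyright)
    return curated


findings = [
    CopyrightFinding("README.md", 3, "Copyright (c) Somebody Else"),
    CopyrightFinding("index.js", 1, "Copyright (C) 2019 Jane Doe"),
]
curation = CopyrightFindingCuration(
    "README.md", (3,), "", "Copyright (C) 2022 John Doe", "INCORRECT"
)
print(apply_curations(findings, [curation]))
```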

d. Associate detected licenses with detected copyrights in a package configuration

id: "NPM::ansi-styles:4.2.1"
license_copyright_finding_curation:
  - license: "MIT"
    copyrights:
       - "Copyright (C) 2022 John Doe"
       - "Copyright (C) 2019 Jane Doe"
       - "Copyright (C) 2012 Example, Inc."

C. Use concluded copyright to overwrite both declared and detected

Introduce the concept of a concluded copyright? If we introduce this, I believe concluded_license should be removed from curations, as it applies to both declared and detected licenses.

concluded_copyright: "Copyright (C) 2022 John Doe"
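
The precedence implied by option C could look like the Python sketch below: if a concluded copyright is set, it replaces both declared and detected copyrights; otherwise the two are merged. The function name `effective_copyrights` and the merge-without-duplicates fallback are assumptions, not something the minutes specify.

```python
def effective_copyrights(declared, detected, concluded_copyright=None):
    """Return the copyrights to report for a package."""
    if concluded_copyright:
        # A concluded copyright overrides everything else.
        return [concluded_copyright]
    # Otherwise merge declared and detected, keeping first occurrences in order.
    return list(dict.fromkeys([*declared, *detected]))


declared = ["Copyright (C) 2019 Jane Doe"]
detected = ["Copyright (C) 2012 Example, Inc", "Copyright (C) 2019 Jane Doe"]

print(effective_copyrights(declared, detected))
print(effective_copyrights(declared, detected, "Copyright (C) 2022 John Doe"))
```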

tsteenbe avatar Jul 14 '22 11:07 tsteenbe