Enable more fine-grained definition of curations
Overview
As somebody responsible for creating curations, I want to be able to write curations that add or remove single license and copyright entries instead of having to redefine the complete list of licenses or copyright entries. While this gives no direct benefit when curating a single component, it makes it easier to transfer curations, e.g. to other versions of the same component.
Proposal of approach for ADD/DELETE operations
The proposed approach works on the ScanCode input data. This introduces some coupling to the ScanCode data model but avoids coupling to the ComponentInfo data model. Working on the input data model gives fine-grained control and makes it possible to write curation rules which avoid being triggered too broadly.
DELETE of Licenses
Found licenses are deleted by defining rules which cause the license findings produced by ScanCode rules to be ignored for files within the scanned codebase. The following conditions are used to define a rule:
- path: path of the file within the sources (defined as a regular expression)
- identifier: identifier of the rule (defined as a regular expression)
- matchedText: matched text of the finding (defined as a regular expression)
This kind of curation is independent of the ComponentInfo data model but introduces coupling to the ScanCode data model and rules.
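The DELETE conditions for licenses can be sketched as follows. This is only an illustration of the proposal, not Solicitor code: the class and function names are hypothetical, and the sketch assumes that all given conditions must hold, that an omitted condition matches anything, and that each regular expression must match the whole value.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LicenseFinding:
    """Hypothetical, simplified view of one license finding in the scan data."""
    path: str
    rule_identifier: str
    matched_text: str
    license: str


@dataclass
class DeleteLicenseRule:
    """Hypothetical DELETE rule; every field is a regular expression or None."""
    path: Optional[str] = None
    identifier: Optional[str] = None
    matched_text: Optional[str] = None

    def matches(self, finding: LicenseFinding) -> bool:
        checks = [
            (self.path, finding.path),
            (self.identifier, finding.rule_identifier),
            (self.matched_text, finding.matched_text),
        ]
        # Assumption: omitted conditions match anything, the others must match
        # the complete value. DOTALL lets "." span multi-line matched text.
        return all(
            pattern is None or re.fullmatch(pattern, value, flags=re.DOTALL)
            for pattern, value in checks
        )


def apply_delete_rules(
    findings: List[LicenseFinding], rules: List[DeleteLicenseRule]
) -> List[LicenseFinding]:
    """Return the findings that are not ignored by any DELETE rule."""
    return [f for f in findings if not any(r.matches(f) for r in rules)]
```

Because the rule refers to the file path and the rule identifier rather than to a concrete list of licenses, the same rule could be reused for another version of the component as long as the findings look the same.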
ADD of Licenses
New licenses are added by defining rules which add license information, either to the licenses found in a source file or at "top level".
Conditions:
- path: path of the file within the sources (defined as a regular expression; if omitted, the license is applied at "top level")
Data:
- license: the SPDX identifier of the license to add
- url: URL to the license text
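A minimal sketch of how an ADD rule for licenses might be applied is shown here. The names `AddLicenseRule`, `apply_add_license_rules` and `TOP_LEVEL` are hypothetical, as is the representation of the scan data as a mapping from file path to a list of (license, url) pairs. The sketch assumes the path expression must match the whole path and that duplicates are not added twice.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical key under which "top level" licenses are collected.
TOP_LEVEL = ""

LicenseEntry = Tuple[str, str]  # (SPDX identifier, URL to the license text)


@dataclass
class AddLicenseRule:
    """Hypothetical ADD rule: data (license, url) plus an optional path condition."""
    license: str
    url: str
    path: Optional[str] = None  # regular expression; None means "top level"


def apply_add_license_rules(
    licenses_by_path: Dict[str, List[LicenseEntry]], rules: List[AddLicenseRule]
) -> Dict[str, List[LicenseEntry]]:
    """Return a copy of the input with the licenses of all ADD rules added."""
    result = {path: list(entries) for path, entries in licenses_by_path.items()}
    for rule in rules:
        entry = (rule.license, rule.url)
        if rule.path is None:
            targets = [TOP_LEVEL]
        else:
            # Assumption: the expression must match the complete file path.
            targets = [
                p for p in licenses_by_path
                if p != TOP_LEVEL and re.fullmatch(rule.path, p)
            ]
        for target in targets:
            bucket = result.setdefault(target, [])
            if entry not in bucket:
                bucket.append(entry)
    return result
```

The input mapping is left unchanged, so the raw scan data stays available for comparison with the curated result.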
DELETE of Copyrights
Found copyrights are deleted by defining rules which cause the copyright findings to be ignored for files within the scanned codebase. The following conditions are used to define a rule:
- path: path of the file within the sources (defined as a regular expression)
- copyright: the found copyright text to ignore (defined as a regular expression)
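For illustration, the copyright DELETE conditions could be evaluated as in the following sketch. The function name and the representation of findings and rules as plain tuples are hypothetical. It assumes that both conditions of a rule must hold and that each expression must match the whole value.

```python
import re
from typing import List, Tuple

CopyrightFinding = Tuple[str, str]     # (file path, found copyright text)
DeleteCopyrightRule = Tuple[str, str]  # (path regex, copyright regex)


def delete_copyrights(
    findings: List[CopyrightFinding], rules: List[DeleteCopyrightRule]
) -> List[CopyrightFinding]:
    """Return the findings that are not ignored by any DELETE rule."""
    return [
        (path, text)
        for path, text in findings
        # Assumption: a rule applies only if path AND copyright text both match.
        if not any(
            re.fullmatch(path_re, path) and re.fullmatch(text_re, text)
            for path_re, text_re in rules
        )
    ]
```

Requiring both conditions keeps a rule from being triggered too broadly: a template placeholder can be dropped in one directory without touching real copyright statements in the same files.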
ADD of Copyrights
New copyrights are added by defining rules which add copyright information, either to the copyrights found in a source file or at "top level".
Conditions:
- path: path of the file within the sources (defined as a regular expression; if omitted, the copyright is applied at "top level")
Data:
- copyright: the copyright string to add
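The ADD operation for copyrights might look like the sketch below. The function `add_copyright` and the split of the data into a per-file mapping and a separate "top level" list are hypothetical. The sketch assumes the path expression must match the whole path and that an already present copyright string is not added again.

```python
import re
from typing import Dict, List, Optional, Tuple


def add_copyright(
    copyrights_by_path: Dict[str, List[str]],
    top_level: List[str],
    copyright: str,
    path: Optional[str] = None,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return curated copies of the per-file and top-level copyright lists."""
    new_by_path = {p: list(entries) for p, entries in copyrights_by_path.items()}
    new_top_level = list(top_level)
    if path is None:
        # No path condition: the copyright is applied at "top level".
        if copyright not in new_top_level:
            new_top_level.append(copyright)
    else:
        for file_path, entries in new_by_path.items():
            # Assumption: the expression must match the complete file path.
            if re.fullmatch(path, file_path) and copyright not in entries:
                entries.append(copyright)
    return new_by_path, new_top_level
```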
Acceptance criteria
- Rules for deleting/adding licenses and copyrights are implemented and can be used.
- The user guide of Solicitor is updated.