Improve the license_clarity_score logic for conflicting_license_categories

Open DennisClark opened this issue 1 year ago • 5 comments

In a recent scan of VictoriaMetrics-1.93.9.tar.gz from https://github.com/VictoriaMetrics/VictoriaMetrics/archive/refs/tags/v1.93.9.tar.gz I noticed that the value of conflicting_license_categories was set to true for this package, which has an overall license of apache-2.0. A perusal of the other detected licenses in the package showed that there are some files licensed under ofl-1.1 which is currently assigned a category of "Copyleft Limited" in the LicenseDB. It did not seem fair or correct to me that code with this license should trigger a "conflicting" condition on the overall project, since in any practical sense the ofl-1.1 should not have any license compliance impact on the use of the project, other than the usual attribution requirements.

I found more information at https://apache.org/legal/resolved.html#weak-copyleft-licenses where the official Apache guidelines state that it is OK to include "weak copyleft" licenses, including the ofl-1.1, in an apache-licensed project. This encouraged me to conclude that code licensed under a "Copyleft Limited" (weak copyleft) license should NOT trigger a conflicting_license_categories condition when used in a project with an overall permissive license. That condition should only result from code under "Copyleft" (strong copyleft) or proprietary/commercial licenses. I think the license_clarity_score logic should be improved to reflect that.

We should also improve the description of conflicting_license_categories wherever it is presented to specify "strong copyleft" rather than just "copyleft" as follows:

"When true, indicates the declared license expression of the software is in the permissive category, but that other potentially conflicting categories, such as strong copyleft and proprietary, have been detected in lower level code. Scoring Weight = -20 (note negative weight)."
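To make the proposed rule concrete, here is a minimal sketch of the check I have in mind. The function name, the category sets, and the small key-to-category mapping are all hypothetical and only for illustration; they are not taken from score.py or read from the LicenseDB.

```python
# Hypothetical category names and mapping, for illustration only.
# These are assumptions, not values read from LicenseDB or score.py.
PERMISSIVE = "Permissive"
CONFLICTING_CATEGORIES = {"Copyleft", "Proprietary Free", "Commercial"}

LICENSE_CATEGORIES = {
    "apache-2.0": "Permissive",
    "mit": "Permissive",
    "ofl-1.1": "Copyleft Limited",
    "gpl-2.0": "Copyleft",
}


def has_conflicting_categories(declared_key, other_keys):
    """
    Return True when the declared license is permissive and at least one
    other detected license falls in a conflicting category.
    "Copyleft Limited" is deliberately absent from CONFLICTING_CATEGORIES.
    """
    if LICENSE_CATEGORIES.get(declared_key) != PERMISSIVE:
        return False
    return any(
        LICENSE_CATEGORIES.get(key) in CONFLICTING_CATEGORIES
        for key in other_keys
    )
```

Under these assumptions, an apache-2.0 project containing ofl-1.1 files would not be flagged, while the same project containing gpl-2.0 files would be.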

DennisClark avatar Dec 12 '23 19:12 DennisClark

attaching the original scan results

summary-2023-12-12-18-23-19.json

DennisClark avatar Dec 12 '23 19:12 DennisClark

@DennisClark Thanks for the report. I'll first check the code that sets the conflicting_license_categories flag and see whether some logic there is causing the issue or whether it comes from the ofl-1.1 licensed Resources.

This encouraged me to conclude that code licensed under a "Copyleft Limited" (weak copyleft) license should NOT trigger a conflicting_license_categories condition when used in a project with an overall permissive license.

We will have to update the clarity scoring code to keep track of specific cases like this.

JonoYang avatar Dec 12 '23 21:12 JonoYang

@JonoYang Actually I think this should apply to Copyleft Limited in general.

DennisClark avatar Dec 12 '23 21:12 DennisClark

@DennisClark

Looking at the code for the license clarity scoring, Copyleft Limited is not in the list of conflicting license categories (https://github.com/nexB/scancode-toolkit/blob/develop/src/summarycode/score.py#L384).

I've run a scan of VictoriaMetrics-1.93.9.tar.gz and I see that these files have the bsd-new_or_gpl-2.0_30.RULE rule matched to them. This rule is both Permissive and Copyleft, so the conflicting_license_categories flag was set to True, even though the declared license expression for these files is mit.

VictoriaMetrics-1.93.9/vendor/github.com/valyala/gozstd/zdict.h
VictoriaMetrics-1.93.9/vendor/github.com/valyala/gozstd/zstd.h
VictoriaMetrics-1.93.9/vendor/github.com/valyala/gozstd/zstd_errors.h

I will take a look at only considering the declared license expression for determining if we have a license conflict.

JonoYang avatar Dec 13 '23 00:12 JonoYang

@JonoYang Very interesting and informative results. Thanks for doing the investigation. Maybe in the case of a dual (disjunctive OR) license expression that contains both a permissive license and a copyleft license, we should not make the copyleft license a factor in the "conflicting" setting, but consider the permissive license as the default choice.
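A rough sketch of that idea, in case it helps the discussion: treat an expression as conflicting only when every OR alternative contains a conflicting license. The helper name and the sample mapping are hypothetical, and the parsing is deliberately naive (flat expressions only, no parentheses or WITH exceptions); a real implementation would need a proper license expression parser.

```python
import re

# Hypothetical sample data, for illustration only.
CATEGORIES = {
    "bsd-new": "Permissive",
    "mit": "Permissive",
    "gpl-2.0": "Copyleft",
}
CONFLICTING = {"Copyleft"}


def expression_conflicts(expression, categories=CATEGORIES, conflicting=CONFLICTING):
    """
    Return True only if every OR alternative of a flat license expression
    contains at least one license in a conflicting category.
    AND binds tighter than OR, so split on OR first, then on AND.
    """
    alternatives = re.split(r"\s+OR\s+", expression.strip(), flags=re.IGNORECASE)
    for alternative in alternatives:
        keys = re.split(r"\s+AND\s+", alternative.strip(), flags=re.IGNORECASE)
        if not any(categories.get(key) in conflicting for key in keys):
            # A non-conflicting choice is available, so take it as the default.
            return False
    return True
```

With this approach, "bsd-new OR gpl-2.0" would not count as conflicting because the permissive choice is available, while "gpl-2.0" alone or "mit AND gpl-2.0" still would.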

DennisClark avatar Dec 13 '23 00:12 DennisClark