support deprecated SPDX license-ids
not sure if this is the lib that was referenced
Our AboutCode license-expression library treats the deprecated SPDX license-ids as unknown licenses in general, but it would probably be a good idea to recognize cases like GPL-2.0 as deprecated aliases rather than unknown.
Originally posted by @mjherzog in https://github.com/CycloneDX/specification/issues/454
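To make the suggestion concrete, here is a minimal sketch of treating deprecated ids as aliases rather than unknowns. `DEPRECATED_ALIASES`, `CURRENT_IDS`, and `classify` are hypothetical names used only for illustration and are not part of license-expression. The two tables are tiny hand-written samples; a real implementation would presumably derive them from the SPDX license list.

```python
# Hypothetical sample table: deprecated SPDX id -> current SPDX id.
DEPRECATED_ALIASES = {
    "GPL-2.0": "GPL-2.0-only",
    "GPL-2.0+": "GPL-2.0-or-later",
}

# Hypothetical sample of ids that are current (not deprecated).
CURRENT_IDS = {"GPL-2.0-only", "GPL-2.0-or-later", "MIT"}


def classify(license_id):
    """Return a (status, resolved_id) pair for a single license id."""
    if license_id in CURRENT_IDS:
        return "current", license_id
    if license_id in DEPRECATED_ALIASES:
        # Known but deprecated: resolve to the replacement instead of "unknown".
        return "deprecated", DEPRECATED_ALIASES[license_id]
    return "unknown", None
```

With this shape, a caller can still accept `GPL-2.0` while telling the user which id replaces it.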
Actually, deprecated license ids are not reported as unknown; they are detected and resolved to the correct, updated license. Reporting deprecated licenses would be great!
There is some confusion about what "deprecated" means. license-expression does treat deprecated scancode licenses as unknown, which is different from a deprecated SPDX license identifier.
I need the latter for my project. The snippet below shows how you can get it working without changing the library. Instead of using the scancode database, I switched to the spdx.org database and massaged the LicenseSymbol object a little to get it working with the new property.
I did not make deprecated symbols raise an exception; I need them to work as valid symbols, but I do want to inform the user about potential deprecations. After parsing is done, you can iterate over the symbols property of the parsing result, check the is_deprecated property of each symbol, and act accordingly.
import json
import pathlib
import urllib.request

import license_expression

SPDX_LICENSES_URL = "https://spdx.org/licenses/licenses.json"
SPDX_LICENSES_FILE = pathlib.Path(__file__).parent / "spdx-licenses.json"


class SpdxLicenseSymbol(license_expression.LicenseSymbol):
    # LicenseSymbol extended with an is_deprecated flag taken from the SPDX db
    def __init__(self, key, aliases=tuple(), is_deprecated=False, is_exception=False, *args, **kwargs):
        super().__init__(key, aliases, is_exception=is_exception, *args, **kwargs)
        self.is_deprecated = is_deprecated

    def __copy__(self):
        return SpdxLicenseSymbol(self.key, self.aliases, self.is_deprecated, self.is_exception)

    def __repr__(self):
        attributes = [f"{self.key!r}"]
        if self.aliases:
            attributes.append(f"aliases={self.aliases!r}")
        attributes.append(f"is_deprecated={self.is_deprecated!r}")
        attributes.append(f"is_exception={self.is_exception!r}")
        return f"{self.__class__.__name__}({', '.join(attributes)})"


def get_license_index():
    # download the SPDX license list once and reuse the local copy afterwards
    if not SPDX_LICENSES_FILE.exists():
        urllib.request.urlretrieve(SPDX_LICENSES_URL, SPDX_LICENSES_FILE)
    with SPDX_LICENSES_FILE.open("rb") as fd:
        return json.load(fd)


def build_spdx_licensing() -> license_expression.Licensing:
    return license_expression.Licensing(
        SpdxLicenseSymbol(
            entry["licenseId"],
            aliases=tuple(),  # there are no aliases in the SPDX db
            is_deprecated=entry["isDeprecatedLicenseId"],
            is_exception=False,  # there are no exceptions in the SPDX license db
        )
        for entry in get_license_index()["licenses"]
    )


licensing = build_spdx_licensing()
expr = licensing.parse("eCos-2.0")
for sym in expr.symbols:
    if sym.is_deprecated:
        print(f"Ups. {sym} is deprecated")
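As a possible variation on the final loop above, the check could be wrapped in a helper that uses the standard `warnings` module instead of `print`, so callers can filter or escalate the message. This is only a sketch: `warn_on_deprecated` is a hypothetical name, and the demo uses `SimpleNamespace` stand-ins rather than real parsed symbols so that it runs without license-expression installed.

```python
import warnings
from types import SimpleNamespace


def warn_on_deprecated(symbols):
    """Warn about each deprecated symbol and return the list of their keys."""
    deprecated = []
    for sym in symbols:
        # Default to False so symbols without an is_deprecated attribute pass through.
        if getattr(sym, "is_deprecated", False):
            warnings.warn(
                f"SPDX license id {sym.key!r} is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
            deprecated.append(sym.key)
    return deprecated


# Stand-in objects for illustration; real input would be expr.symbols.
demo_symbols = [
    SimpleNamespace(key="eCos-2.0", is_deprecated=True),
    SimpleNamespace(key="MIT", is_deprecated=False),
]
demo_result = warn_on_deprecated(demo_symbols)
```

With the snippet above, the call would be `warn_on_deprecated(licensing.parse("eCos-2.0").symbols)`.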
@pcastellazzi Thanks! Would you mind submitting a PR? 😇 That would be great!
No problem, give me a few days.
A few questions:
Should I replace the backend (scancode) with SPDX? If so, there is an official repo with the files: should we submodule it, vendor it, or download the files when needed?
While there are no license exceptions in the SPDX license db, there is an SPDX exception db at https://spdx.org/licenses/exceptions.json.
Is the license expression GPL-2.0-with-GCC-exception different from GPL-2.0 WITH GCC-exception?
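On the exceptions question, one option might be to read both SPDX files and flag the entries from the exception db with `is_exception=True`. The sketch below is an assumption-laden illustration: `symbol_specs` is a hypothetical helper, the JSON strings are tiny hand-written excerpts rather than the real files, and the field names (`licenses`, `licenseId`, `exceptions`, `licenseExceptionId`, `isDeprecatedLicenseId`) reflect my understanding of the published files and should be checked against them.

```python
import json

# Hand-written excerpts shaped like licenses.json and exceptions.json (assumed layout).
LICENSES_JSON = """
{"licenses": [
  {"licenseId": "GPL-2.0-only", "isDeprecatedLicenseId": false},
  {"licenseId": "GPL-2.0-with-GCC-exception", "isDeprecatedLicenseId": true}
]}
"""
EXCEPTIONS_JSON = """
{"exceptions": [
  {"licenseExceptionId": "GCC-exception-2.0", "isDeprecatedLicenseId": false}
]}
"""


def symbol_specs(licenses_doc, exceptions_doc):
    """Yield (key, is_deprecated, is_exception) tuples from both documents."""
    for entry in licenses_doc["licenses"]:
        yield entry["licenseId"], entry["isDeprecatedLicenseId"], False
    for entry in exceptions_doc["exceptions"]:
        yield entry["licenseExceptionId"], entry["isDeprecatedLicenseId"], True


specs = list(symbol_specs(json.loads(LICENSES_JSON), json.loads(EXCEPTIONS_JSON)))
```

Each tuple could then be passed to `SpdxLicenseSymbol(key, is_deprecated=..., is_exception=...)` from the earlier snippet, which would leave the question of how to fetch or vendor the files independent of how the symbols are built.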