license-expression
SPDX Failing to parse license for no obvious reason
Hi license-expression. I must begin by saying that this is a great piece of software, and I'm grateful for your contributions.
I noticed a strange edge case when using the SPDX license parser. The parser raises an exception when I try to parse `Sleepycat License` but is fine with `Sleepydog License` or even `Sleepyca License`.
Reproducible example:
import license_expression

SPDX_LICENSING = license_expression.get_spdx_licensing()
# Raises ExpressionParseError: Invalid symbols sequence such as (A B) for token: "License" at position: 10
_ = SPDX_LICENSING.parse('Sleepycat License')
# Works
_ = SPDX_LICENSING.parse('Sleepydog License')
_ = SPDX_LICENSING.parse('Sleepyca License')
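Run as one script, the first call raises and the two "Works" lines are never reached. A minimal sketch of a harness that records each outcome instead of stopping at the first exception follows. `try_parse` is a hypothetical helper written for this report, not part of license-expression, and the library import is guarded so the sketch still loads where the package is missing.

```python
def try_parse(parse, expression, **kwargs):
    """Call parse(expression, **kwargs) and report the outcome instead of raising.

    Returns ("ok", result) on success, or ("error", "<ExceptionName>: <message>").
    Hypothetical helper for this report; not part of license-expression.
    """
    try:
        return "ok", parse(expression, **kwargs)
    except Exception as exc:  # broad on purpose: we only want to record the failure
        return "error", f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    try:
        import license_expression
    except ImportError:
        print("license-expression is not installed; skipping the live check")
    else:
        licensing = license_expression.get_spdx_licensing()
        for text in ("Sleepycat License", "Sleepydog License", "Sleepyca License"):
            print(text, "->", try_parse(licensing.parse, text))
```

Passing `validate=True` through `try_parse` allows the same loop to compare validated and non-validated parsing side by side.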
Relevant versions, installed via conda:
python 3.11.4 h47c9636_0_cpython conda-forge
license-expression 30.1.1 pyhd8ed1ab_0 conda-forge
Thanks in advance!
@DamianBarabonkovQC Thanks for the report. Here is what happens:
1. `Sleepycat` is a known SPDX identifier
2. `Sleepyca`, `Sleepydog` and `License` are not known identifiers
The basic, non-validating parsing does not validate anything when it recognizes no known identifier. If you use `.validate()` (https://github.com/nexB/license-expression/blob/dd54f5125428fc070637b7db6ca780b2cda63ca3/src/license_expression/__init__.py#L754) or `.parse(validate=True)` (https://github.com/nexB/license-expression/blob/dd54f5125428fc070637b7db6ca780b2cda63ca3/src/license_expression/__init__.py#L472), the expressions in 2. (`Sleepyca License`, `Sleepydog License`) will fail to parse too.
Somehow, in the expression in 1. (`Sleepycat License`), the SPDX identifier "Sleepycat" is recognized and the parser does validate further.
I reckon the behaviour is inconsistent and buggy.
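To make the asymmetry concrete, here is a toy model of the behaviour described above. It is a deliberately simplified sketch, not the library's tokenizer: `KNOWN_KEYS`, `toy_symbols` and `toy_parse` are hypothetical names, operators such as `and`/`or` are ignored, and the only assumption it encodes is that a recognized key becomes its own symbol while a run of unrecognized words is merged into a single symbol.

```python
KNOWN_KEYS = {"sleepycat", "mit"}  # tiny stand-in for the SPDX key list (assumption)


def toy_symbols(expression):
    """Group words into symbols: a known key stands alone, and each run of
    consecutive unknown words is merged into one symbol."""
    symbols, unknown_run = [], []
    for word in expression.split():
        if word.lower() in KNOWN_KEYS:
            if unknown_run:
                symbols.append(" ".join(unknown_run))
                unknown_run = []
            symbols.append(word)
        else:
            unknown_run.append(word)
    if unknown_run:
        symbols.append(" ".join(unknown_run))
    return symbols


def toy_parse(expression):
    """Reject an empty expression or two symbols in a row with no operator."""
    symbols = toy_symbols(expression)
    if not symbols:
        raise ValueError("empty expression")
    if len(symbols) > 1:
        raise ValueError(f"Invalid symbols sequence such as (A B): {symbols}")
    return symbols[0]
```

In this model `Sleepyca License` collapses into one unknown symbol and passes non-validating parsing, while `Sleepycat License` splits into a known symbol followed by an unknown one and is rejected as two adjacent symbols.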
- Expression 1. (`Sleepycat License`) yields:
raise ExpressionParseError(
license_expression.ExpressionParseError: Invalid symbols sequence such as (A B) for token: "License" at position: 10
- Expression 2. (`Sleepyca License`) with validate yields:
>>> _ = SPDX_LICENSING.parse('Sleepyca License', validate=True)
...
raise ExpressionError(msg)
license_expression.ExpressionError: Unknown license key(s): Sleepyca License
The first failure, from non-validated parsing, is probably OK.
The second case, which does not fail without validation, should fail too, either with `Unknown license key(s)` or rather with an `Invalid symbols sequence` error.
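Until the two paths agree, a caller who needs one consistent answer could go through `validate()` rather than `parse()`. The sketch below assumes that `validate()` returns an object exposing an `errors` list, as the project README describes. `spdx_errors` is a hypothetical helper, not part of the library.

```python
def spdx_errors(licensing, expression):
    """Return the error messages reported for expression; an empty list means none.

    licensing is expected to be the object returned by
    license_expression.get_spdx_licensing(). Hypothetical helper, not part of
    the library.
    """
    info = licensing.validate(expression)
    # Assumption: the returned object exposes an `errors` list of strings.
    return list(info.errors)


# Usage (requires license-expression to be installed):
#   import license_expression
#   licensing = license_expression.get_spdx_licensing()
#   for text in ("Sleepycat License", "Sleepyca License"):
#       print(text, spdx_errors(licensing, text))
```

Taking the licensing object as a parameter keeps the helper independent of how the licensing is built and easy to exercise with a stand-in object.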