Official Apache-1.1 license text is not being matched correctly by LicenseCompareHelper.matchingStandardLicenseIdsWithinText()

Open • pmonks opened this issue 3 months ago • 1 comment

When org.spdx.utility.compare.LicenseCompareHelper.matchingStandardLicenseIdsWithinText() is run on the official Apache-1.1 license text, it fails to find any matches. I believe I've narrowed the problem down to the Clause5 alternative text (<alt>) tag in the template: if I remove the example header from the license text and run org.spdx.utility.compare.LicenseCompareHelper.isTextStandardLicense().getDifferenceMessage() on the result, I get:

Variable text rule combined-bullet-Clause5 did not match the compare text starting at line #31 column #1 "5" while processing rule var: combined-bullet-Clause5

When I manually convert that <alt> tag into a Java regex and manually cleanse bullet 5 of the Apache-1.1 license text of comment characters and newlines, I do get a match, so I'm pretty confident the problem is in the library rather than the template. Beyond that, I'm not really sure what the root cause might be: it could be comment character handling, the regexification of that particular <alt> tag, or something else entirely.
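
For anyone wanting to repeat that manual check, here is a rough sketch of the cleanse-then-match step using only java.util.regex. The class name, the helper names, and above all the BULLET_5 pattern are assumptions made up for illustration: the pattern is a simplified stand-in and is not the match expression from the SPDX Apache-1.1 template, so substitute the real regex derived from the <alt> tag before drawing any conclusions.

```java
import java.util.regex.Pattern;

class Clause5RegexCheck {

    // HYPOTHETICAL stand-in pattern, not taken from the SPDX template.
    // Replace it with the regex derived from the combined-bullet-Clause5 <alt> tag.
    static final Pattern BULLET_5 = Pattern.compile(
            "5\\.?\\s+Products derived from this software may not be called .+");

    // Strips a leading comment marker ("/*", "*", "*/", "//", "#") from each line,
    // then joins the lines and collapses all whitespace runs to single spaces.
    static String cleanse(String raw) {
        StringBuilder joined = new StringBuilder();
        for (String line : raw.split("\\R")) {
            String stripped = line.replaceFirst("^\\s*(/\\*+|\\*+/?|//+|#+)?\\s*", "");
            joined.append(stripped).append(' ');
        }
        return joined.toString().replaceAll("\\s+", " ").trim();
    }

    // True only if the whole text matches the stand-in pattern.
    static boolean matchesBullet5(String text) {
        return BULLET_5.matcher(text).matches();
    }

    public static void main(String[] args) {
        // Bullet 5 as it might appear inside a block comment.
        String commented =
                " * 5. Products derived from this software may not be called \"Apache\",\n"
              + " *    nor may \"Apache\" appear in their name, without prior written\n"
              + " *    permission of the Apache Software Foundation.\n";

        String cleansed = cleanse(commented);
        System.out.println(cleansed);
        System.out.println("matches: " + matchesBullet5(cleansed));
    }
}
```

If the real template-derived regex matches the cleansed text in a check like this but the library still reports a difference, that points at the library's own normalization or regex generation rather than at the template.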

This was reproduced with Spdx-Java-Library v1.11 and SPDX license list v3.23.

pmonks • Mar 14 '24 22:03