Spdx-Java-Library icon indicating copy to clipboard operation
Spdx-Java-Library copied to clipboard

Potential enhancement: LicenseCompareHelper.isTextStandardLicense().getDifferenceMessage() should sort by line then column

Open pmonks opened this issue 1 year ago • 3 comments

Currently org.spdx.utility.compare.LicenseCompareHelper.isTextStandardLicense().getDifferenceMessage() will list each difference in a seemingly random order, but it would be beneficial if instead they were sorted by line number and then by column number, for readability.

pmonks avatar Mar 26 '24 20:03 pmonks

We can sort List<LineColumn> differences in the class DifferenceDescription.

https://github.com/spdx/Spdx-Java-Library/blob/718fd6573ff3c975f8e70c9d8b5bb0165837f46b/src/main/java/org/spdx/utility/compare/CompareTemplateOutputHandler.java#L645-L649

This can be done either at 1) setDifferences()+addDifference() or 2) getDifferences(). Performance hit will depend on how DifferenceDescription got used normally (set/add frequently vs get frequently).

The main change will be around differenceMessage because currently the string is created cumulatively at each addDifference() call and the sorting of differences list will not change the order of text order in the differenceMessage string.

A possible solution is to move the string generation outside of addDifference() and delay the generation until it is being used at getDifferenceMessage() call.

bact avatar May 06 '25 14:05 bact

Another thought: should the LineColumn class be enhanced to include all of the information that's displayed in the difference message? That would allow getDifferenceMessage() to become a "dumb" string generator around the List<LineColumn>, while (perhaps more importantly) providing all of the difference information as structured data available via an API to callers who want to generate their own difference message (e.g. as HTML).

Going from memory, I think what's missing from LineColumn are the "expected" and "actual" texts (but there might be more).

pmonks avatar May 06 '25 14:05 pmonks

I'll have to double check, but I think this is an order to the messages. The first message is where the match logic actually failed after exhausting all possible variable and optional texts. The subsequent messages are why any option or variable text failed their evaluations. When I use the messages to debug, the order is helpful to understand how the matching logic failed.

If you want to represent this order in a UI, you could do something like A failed a <line,column> after trying to match option or variable text failing at b <line,column>, ...

Also, the difference message doesn't actually capture all the differences in the license text - once it gets to a point where there obviously isn't a match, it just stops.

If we want to write a good UI for describing the changes and failures, I would suggest doing a standard text diff first to capture a complete list of differences in a highlighted, human readable way then using the difference messages highlight which actual diff caused the match failure.

goneall avatar May 06 '25 18:05 goneall