
Snippet Model cannot be mapped from FossID multiline range matching

Open nnobelis opened this issue 1 year ago • 18 comments

Currently, for performance reasons (see https://github.com/oss-review-toolkit/ort/issues/7028), the matched lines are not fetched from FossID.

However, it seems the current snippet model cannot represent certain FossID responses.

For instance, for a given snippet match in FossID, the listMatchedLines function returns the following response (truncated for clarity):

"data": {
		"local_file": {
			"1": "1",
			"2": "2",
			"3": "3",
			"4": "4",
			"5": "5",
			"6": "6",
			"7": "7",
[...]
			"19": "19",
			"20": "20",
			"21": "21",
			"22": "22",
			"23": "23",
			"24": "24",
			"45": "45",
			"46": "46",
			"47": "47",
			"48": "48",
			"49": "49",
			"50": "50",
			"51": "51",
[...]
			"86": "86",
			"87": "87",
			"88": "88",
			"89": "89",
			"90": "90",
			"91": "91",
[...]
			"673": "673",
			"674": "674",
			"675": "675"

local_file contains the lines of the source file being matched. If one "compresses" these lines into line ranges, the result is two ranges: 1-24 and 45-675.
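The compression step described above can be sketched as follows. This is a minimal illustration in Python, not ORT or FossID code: the function name `compress_lines` is hypothetical, and the sample dictionary assumes that the lines elided with `[...]` in the response are contiguous, which is what the stated result (1-24 and 45-675) implies.

```python
from typing import Iterable, List, Tuple


def compress_lines(lines: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse line numbers into sorted, inclusive (start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for line in sorted(set(lines)):
        if ranges and line == ranges[-1][1] + 1:
            # The line directly follows the current range: extend it.
            ranges[-1] = (ranges[-1][0], line)
        else:
            # There is a gap (or this is the first line): start a new range.
            ranges.append((line, line))
    return ranges


# Hypothetical stand-in for the "local_file" object of the response above,
# assuming the elided lines are contiguous.
local_file = {str(n): str(n) for n in [*range(1, 25), *range(45, 676)]}

matched = compress_lines(int(key) for key in local_file)
print(matched)  # [(1, 24), (45, 675)]
```

Any match whose lines contain at least one gap yields more than one range, which is exactly the case a single start/end pair cannot hold.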

This information is supposed to go into SnippetFinding.sourceLocation: https://github.com/oss-review-toolkit/ort/blob/6f6e91759730ec20dc172b60612d5d76ab35c232/model/src/main/kotlin/SnippetFinding.kt#L31

Unfortunately, TextLocation in ORT can carry only two integers for the line information.

So what can be done?

  • Split the snippet match into two SnippetFindings, one per range? This could get messy for more complicated matches.
  • Replace TextLocation for the snippet finding with TextRangeLocation, a new class that can specify multiple ranges of lines?
  • Map only the first range to the TextLocation and ignore the rest (or store it in another property)?
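To make the trade-offs between these three options easier to compare, here is a rough sketch in Python. None of this is ORT code: `TextLocation` is a simplified stand-in for the ORT class (a path plus one start/end pair), `TextRangeLocation` is the hypothetical class proposed above, and the helper names and the example path are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

LineRange = Tuple[int, int]  # inclusive (start, end)


@dataclass(frozen=True)
class TextLocation:
    """Simplified stand-in: a path and a single inclusive line range."""
    path: str
    start_line: int
    end_line: int


@dataclass(frozen=True)
class TextRangeLocation:
    """Hypothetical class for option 2: a path with several line ranges."""
    path: str
    ranges: Tuple[LineRange, ...]


def split_into_locations(path: str, ranges: Sequence[LineRange]) -> List[TextLocation]:
    """Option 1: one single-range location (hence one finding) per range."""
    return [TextLocation(path, start, end) for start, end in ranges]


def first_range_only(path: str, ranges: Sequence[LineRange]) -> Tuple[TextLocation, List[LineRange]]:
    """Option 3: map the first range, return the rest to drop or store elsewhere."""
    (start, end), *rest = ranges
    return TextLocation(path, start, end), rest


path = "src/Example.kt"  # hypothetical file name
ranges = [(1, 24), (45, 675)]

option1 = split_into_locations(path, ranges)
option2 = TextRangeLocation(path, tuple(ranges))
option3, remaining = first_range_only(path, ranges)
```

Option 1 keeps the existing model but produces one object per gap-free range, option 2 keeps one object per match at the cost of a model change, and option 3 loses information unless the remaining ranges are stored somewhere.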

As a side note, ScanOSS seems to deliver a single range:

    "lines": "1-710",
    "oss_lines": "1-710",

but this is only an assumption, as I couldn't reproduce more complex cases of matching.
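If the single-range assumption holds, mapping such a value to one start/end pair is straightforward. The sketch below (Python, with the hypothetical helper name `parse_single_range`) deliberately rejects anything that is not a plain `start-end` string, so that a more complex format would surface as an error instead of being silently truncated. How ScanOSS would actually encode several ranges is not known here.

```python
import re
from typing import Tuple

# Matches exactly one "start-end" pair of decimal line numbers.
_SINGLE_RANGE = re.compile(r"(\d+)-(\d+)")


def parse_single_range(value: str) -> Tuple[int, int]:
    """Parse a 'start-end' string into an inclusive (start, end) pair."""
    match = _SINGLE_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Not a single 'start-end' line range: {value!r}")
    return int(match.group(1)), int(match.group(2))


# The two fields quoted above.
result = {"lines": "1-710", "oss_lines": "1-710"}

source_range = parse_single_range(result["lines"])
snippet_range = parse_single_range(result["oss_lines"])
```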

nnobelis, May 25 '23 15:05