spdx-spec icon indicating copy to clipboard operation
spdx-spec copied to clipboard

Simplify Snippet Ranges representation in YAML/JSON

Open tsteenbe opened this issue 5 years ago • 9 comments

In SPDX 2.2 example Snippet Byte Range and Snippet Line Range are currently encoded as RDF (see below code) but elswhere we consistently followed SPDX tag:value. Propose to align ranges wit tag:value for consistecy and make adoption/generation easier.

    ranges:
    - endPointer:
        offset: 420
        reference: "SPDXRef-DoapSource"
      startPointer:
        offset: 310
        reference: "SPDXRef-DoapSource"
    - endPointer:
        lineNumber: 23
        reference: "SPDXRef-DoapSource"
      startPointer:
        lineNumber: 5
        reference: "SPDXRef-DoapSource"

Change to:

SnippetByteRange: "310:420"
SnippetLineRange: "5:23"

tsteenbe avatar Jun 11 '20 16:06 tsteenbe

I'd agree. My only question, if we're reopening it, is whether it would be more useful / easier to adopt to represent it as integers -- rather than as a string with a colon that needs to be parsed.

So maybe something like (in JSON format):

SnippetByteRange: [310, 420]
SnippetLineRange: [5, 23]

or

SnippetByteRange: {"start": 310, "end": 420}
SnippetLineRange: {"start": 5, "end": 23}

or

SnippetByteStart: 310
SnippetByteEnd: 420
SnippetLineStart: 5
SnippetByteEnd: 23

I don't have a strong feeling and I know I tend to overcomplicate things, so I'm fine with what @tsteenbe proposed above too.

swinslow avatar Jun 11 '20 19:06 swinslow

The snippet format is overly complex in my opinion and perhaps it should be simplified in 3.0. Having the reference is redundant and really increases the complexity. I like @swinslow proposals above as it doesn't require any string parsing to implement.

I personally think the most JSON like approach would be:

SnippetByteRange: {"start": 310, "end": 420}
SnippetLineRange: {"start": 5, "end": 23}

As background, the reason for this complexity is we wanted to leverage an existing W3C standard definition of how to describe snippets. Although this is useful in the RDF world, I don't think it helps anything in the JSON, XML and YAML representations. We could easily translate between a simpler format and the RDF prior to 3.0.

A few general comments we should probably sync up perhaps during a tech call:

In SPDX 2.2 example Snippet Byte Range and Snippet Line Range are currently encoded as RDF

That is not actually the case - it follows the object model. The RDF also follows the object model. Tag value also follows the model but is a bit different since it had a requirement to be "greppable". I do not think greppable is a requirement for JSON and YAML.

but elsewhere we consistently followed SPDX tag/value

There's actually a lot of cases where we don't use the same exact terms as tag/value. The current examples follows the model which tag/value also mostly follows.

Propose to align ranges wit tag:value for consistency and make adoption/generation easier.

I would propose that we make the JSON and YAML formats most natural to producers and consumers of JSON and YAML while adhering to a common object model and common vocabulary. If that happens to align more with tag/value, then that's OK. If it happens to align more with RDF, that's OK too.

goneall avatar Jun 11 '20 20:06 goneall

I like @swinslow proposals above as it doesn't require any string parsing to implement.

In my initial proposal I tried to follow tag:value but personally also prefer to remove the requirement to do string parsing.

I would also support simplifying it to:

SnippetByteRange: {"start": 310, "end": 420}
SnippetLineRange: {"start": 5, "end": 23}

P.S. if we like to avoid string parsing should we have at simplifying creators?

tsteenbe avatar Jun 12 '20 12:06 tsteenbe

P.S. if we like to avoid string parsing should we have at simplifying creators?

+1 from me :)

"creators": [ {"type": "Tool", "value": "LicenseFind-1.0"}, ... ]

?

swinslow avatar Jun 12 '20 13:06 swinslow

P.S. if we like to avoid string parsing should we have at simplifying creators?

+1 for me too. Probably a 3.0 enhancement since it is a but more structural. I would also like to change the RDF/XML to not require parsing.

We have a similar opportunity/issue with the originator and suppliers with emails.

goneall avatar Jun 13 '20 16:06 goneall

Moving to 2.3

goneall avatar Mar 09 '22 22:03 goneall

@tsteenbe @swinslow I'm thinking this should be moved to 3.0 since it would be a breaking change. Agree?

goneall avatar May 18 '22 18:05 goneall

@goneall I tend to agree, since the JSON and YAML formats were introduced in 2.2, a change here would be a breaking change and would make 2.2 JSON / YAML documents no longer valid. So I'd say that means this is for 3.0.

swinslow avatar May 18 '22 20:05 swinslow

Proposal will be a breaking change, so call concensus is to move it to 3.0

kestewart avatar May 24 '22 17:05 kestewart

This is resolved in 3.0 - closing

goneall avatar Apr 04 '24 17:04 goneall