
JSON-LD/RDF Round tripping?

Open azaroth42 opened this issue 7 years ago • 6 comments

As discussed on 2018-07-06 call, one aspect of the work is the extent to which JSON-LD instances can be round tripped through RDF, and vice versa.

In particular, questions arose as to:

  • Should any valid RDF graph be able to be serialized 100% faithfully in JSON-LD, such that it can be parsed back to the exact same graph?
  • Conversely, should any JSON-LD instance be able to be parsed to an RDF graph, and back to the same JSON-LD?

Currently the answer to both questions is no:

  • RDF has constructs that cannot be serialized to JSON-LD: An rdf:List of rdf:Lists is not able to be serialized
  • JSON-LD allows constructs that are not valid in RDF: Properties can be blank nodes
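As an illustration of the second point, here is a minimal sketch that scans an expanded JSON-LD document for property keys that are blank node identifiers (keys starting with `_:`). The function name `find_bnode_properties` and the sample data are hypothetical, and the sketch assumes the input is already in expanded form, so that every key is a keyword, an IRI, or a blank node identifier.

```python
def find_bnode_properties(node, path=""):
    """Return the paths of properties whose key is a blank node identifier."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}/{key}"
            if key.startswith("_:"):
                hits.append(here)
            hits.extend(find_bnode_properties(value, here))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_bnode_properties(item, f"{path}[{index}]"))
    return hits


# Hypothetical expanded document: the "_:b0" property has no standard RDF
# counterpart, because RDF predicates must be IRIs.
expanded = [{
    "@id": "http://example.org/thing",
    "_:b0": [{"@value": "not expressible as a standard RDF triple"}],
    "http://example.org/name": [{"@value": "expressible"}],
}]

problem_paths = find_bnode_properties(expanded)
```

A check like this could be run before conversion to flag data that will not survive a trip through RDF.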

Work to be done:

  • Discuss the degree to which support for round tripping, in each direction, is important to us.
  • Determine the extent to which round tripping is not possible for both directions, with references to the specs.
  • Provide test cases for each non-round-tripping scenario
  • Document in the specs where round tripping is not possible

Current taskforce: @azaroth42 @workergnome @gkellogg @mixterj (add yourself if you're interested!)

azaroth42 avatar Jul 09 '18 16:07 azaroth42

The other thing complicating JSON-LD => RDF => JSON-LD round-tripping is array ordering. For example, an object with two values for rdf:type will have them in arbitrary order when re-serialized to JSON-LD, as will any other property with multiple values. This could possibly be addressed by a post-processing step that orders all such values (being careful not to do this for lists), which would also require a comparator to be provided for the sort. While this might be important for test evaluation, it's of little value in the real world, and imposes a processing penalty that's probably not worth it.
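A minimal sketch of such a post-processing step, assuming expanded JSON-LD held as plain Python dicts and lists. The names `sort_key` and `normalize` are hypothetical, and the ordering used (the JSON serialization with sorted keys) is just one arbitrary but deterministic choice, not something any spec defines.

```python
import json


def sort_key(value):
    # Arbitrary but deterministic ordering: compare serialized forms.
    return json.dumps(value, sort_keys=True)


def normalize(value, in_list=False):
    """Recursively sort arrays, leaving @list contents in document order."""
    if isinstance(value, dict):
        return {k: normalize(v, in_list=(k == "@list")) for k, v in value.items()}
    if isinstance(value, list):
        items = [normalize(v) for v in value]
        return items if in_list else sorted(items, key=sort_key)
    return value


doc_a = {
    "@id": "http://example.org/x",
    "@type": ["http://example.org/A", "http://example.org/B"],
    "http://example.org/p": [{"@list": [{"@value": 2}, {"@value": 1}]}],
}
# Same data, with the @type values in the opposite order.
doc_b = dict(doc_a, **{"@type": ["http://example.org/B", "http://example.org/A"]})
# Different data: the ordered list is reversed.
doc_c = dict(doc_a, **{"http://example.org/p": [{"@list": [{"@value": 1}, {"@value": 2}]}]})

same_ignoring_order = normalize(doc_a) == normalize(doc_b)
lists_still_ordered = normalize(doc_a) != normalize(doc_c)
```

The extra pass over every array is the processing penalty in question: it touches the whole document just to make two serializations comparable.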

The same applies to blank node labels: something like RDF Dataset Normalization, performed at the JSON-LD level, might be necessary to compare JSON-LD structures with each other.

One possible solution would be to largely abandon JSON to JSON comparisons and always compare the resulting RDF datasets. This would be a change from 1.0, and would not allow comparing non-RDF things such as @data values.
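To sketch what comparing at the RDF level means, the following treats a graph as a set of (subject, predicate, object) string tuples and checks whether two graphs are equal up to blank node relabeling. This is a brute-force illustration for tiny graphs only, not the RDF Dataset Normalization algorithm. The helper names are hypothetical, named graphs are ignored, and literals are assumed to be written in quoted form so they never start with `_:`.

```python
from itertools import permutations


def is_bnode(term):
    return isinstance(term, str) and term.startswith("_:")


def bnodes(triples):
    # Sorted list of distinct blank node labels used anywhere in the graph.
    return sorted({t for triple in triples for t in triple if is_bnode(t)})


def relabel(triples, mapping):
    return {tuple(mapping.get(t, t) for t in triple) for triple in triples}


def same_graph(a, b):
    """Brute-force isomorphism check; factorial cost in the number of blank nodes."""
    a, b = set(a), set(b)
    labels_a, labels_b = bnodes(a), bnodes(b)
    if len(a) != len(b) or len(labels_a) != len(labels_b):
        return False
    return any(
        relabel(a, dict(zip(labels_a, perm))) == b
        for perm in permutations(labels_b)
    )


g1 = {
    ("http://example.org/s", "http://example.org/knows", "_:b0"),
    ("_:b0", "http://example.org/name", '"Alice"'),
}
# Same graph with a different blank node label.
g2 = {
    ("http://example.org/s", "http://example.org/knows", "_:x"),
    ("_:x", "http://example.org/name", '"Alice"'),
}
# Genuinely different graph.
g3 = {
    ("http://example.org/s", "http://example.org/knows", "_:x"),
    ("_:x", "http://example.org/name", '"Bob"'),
}
```

Because array order and blank node labels do not exist at this level, both sources of spurious differences disappear, at the cost of losing anything that never reaches the graph.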

gkellogg avatar Jul 10 '18 00:07 gkellogg

Which is related to this issue where changing the ordering will change the compaction, as different scoped contexts would be chosen.

azaroth42 avatar Jul 10 '18 18:07 azaroth42

Can we try to be precise? We have two different issues:

  1. Round tripping to and from an RDF graph
  2. Round tripping to and from a particular RDF syntax, most probably Turtle

On (1), I believe it is true that:

  • Every RDF graph can be put into JSON-LD
  • Some JSON-LD cannot be converted into an RDF graph, e.g., the blank node properties cited by @azaroth42.

The more problematic case is (2), also described in https://github.com/w3c/json-ld-bp/issues/13#issuecomment-403660724.

As far as the BP document goes, my take is that we should not use non-RDF features. We had hardly any use cases for literal-as-subject or bnode-as-predicate, and users should try to avoid them. I am not sure it is worth going much beyond that in the document.

As for (2), I am not sure what we can say.

iherman avatar Dec 11 '19 16:12 iherman

Can we separate out some tasks here?

"Provide test cases for each non-round-tripping scenario" is clearly not in scope for a best practices document. On the other hand, it seems very reasonable to me to offer some advice about problematic constructions here in the BP doc.

Do we actually have any use cases for literal-as-subject or bnode-as-predicates? I'm not asking because I think we do not: I genuinely do not remember hearing any, but I had plenty of chances to miss them! :grimacing:

ajs6f avatar Dec 16 '19 16:12 ajs6f

Given the evolution of the "bnodes-as-predicates" discussion, I'm not entirely sure what we want to say here now. Do we want to say anything about JSON-LD 1.0 that uses bnodes as predicates and what that means for round tripping?

ajs6f avatar Feb 07 '20 16:02 ajs6f

This issue was discussed in a meeting.

  • No actions or resolutions
Transcript: issue on RDF roundtripping
Benjamin Young: https://github.com/w3c/json-ld-bp/issues/13
Benjamin Young: 13 minutes left …
… Let's look at 13, the round tripping one
… take a minute to go through comments together and then as people have thoughts, chime in
Pierre-Antoine Champin: To be sure … the original text of the issue mentions lists of lists which is out of date for lack of support
Benjamin Young: BP should call out when one is required to accomplish it, people can already have json-ld content
Ivan Herman: Rob, I think you raised this … what do we mean by round-tripping? Strictly the RDF model level, or is it from a particular serialization and round trip back to the closest one?
… rdf has “wonderful” feature of a zillion different ways to express it
Rob Sanderson: I was thinking that if you have JSON-LD and you parse it into the abstract graph
… then manipulate it and re-serialize it, that you end up at the same place.
… does all of the information in the JSON-LD make it into the RDF graph and then back into the output JSON-LD
… and then there’s framing to make sure you get it exactly into the same tree structure…or not
… if you have JSON-LD with blank nodes and go into RDF and back into JSON-LD then those blank nodes will vanish
… so work to be done on this issue is still correct
… find the things that are important to note in these scenarios
… and document in the best practice: if you want to have your data round-trippable, here’s what you should and shouldn’t do
… and Gregg’s comments that follow around array ordering are interesting…especially if you’re thinking JSON
… and thinking only JSON-LD then you get consistent blank node naming, but not if you go through RDF
Ivan Herman: CURIE in JSON-LD then into Turtle, does the CURIE stay the same? or are the rules different?
Rob Sanderson: I don’t know
Pierre-Antoine Champin: the way I read this was round trip between rdf abstract syntax and json-ld
… I don’t know if I can round trip turtle to turtle
… in fact I know that I can’t, e.g. the curies, as the implementations don’t have a notion of prefixes as they’re not part of the abstract syntax
… shouldn’t talk about other syntaxes, only abstract and json-ld
Rob Sanderson: +1 to pa
Pierre-Antoine Champin: Maybe it’s worth mentioning that … other syntaxes have their own problems that we can’t address
Benjamin Young: Good points!
… do we feel like we have a complete list on the issue?
… or can we write in other issues to the issue
… the things that we’re up against and whether we can solve them
Rob Sanderson: throw in whatever notes you can
Benjamin Young: consolidate and summarize from what you’ve read above would be good
… or just take that on to make sure the end of the issue we have something to take forward
… rather than rereading the whole time
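On the CURIE question raised in the transcript, a minimal sketch of why prefixes cannot round trip through the abstract syntax: two different prefix maps expand to the same IRI, and only the IRI is part of the graph. The `expand` helper and the prefix maps are hypothetical and deliberately simpler than the real JSON-LD or Turtle expansion rules.

```python
def expand(term, prefixes):
    """Expand a prefix:suffix term to a full IRI using a prefix map."""
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + suffix
    return term  # not a known prefix, so leave the term untouched


# Hypothetical prefix declarations in two different serializations.
jsonld_prefixes = {"schema": "http://schema.org/"}
turtle_prefixes = {"s": "http://schema.org/"}

from_jsonld = expand("schema:name", jsonld_prefixes)
from_turtle = expand("s:name", turtle_prefixes)
```

Both calls produce the same IRI, so nothing in the resulting graph records which prefix was used. A serializer has to pick its own prefixes on the way back out.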

iherman avatar Feb 07 '20 18:02 iherman