
Expansion does not take property-scoped contexts for nested properties into account

Open niklasl opened this issue 5 years ago • 8 comments

There is currently a limitation in the JSON-LD 1.1 expansion algorithm which ignores property-scoped contexts for nested properties.

My specific case is:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
    "Print": "bibo:Book",
    "name": "http://www.w3.org/2000/01/rdf-schema#label",
    "instanceOf": "@nest",
    "contributionByRole": {
      "@id": "@nest",
      "@context": {
        "agent": "@nest",
        "aut": "creator"
      }
    },
    "provisionActivityByType": {
      "@id": "@nest",
      "@context": {
        "Publication": {
          "@id": "@nest",
          "@context": {"date": "published", "agent": "publisher"}
        }
      }
    },
    "identifiedByType": {
      "@id": "@nest",
      "@context": {
        "Isbn": {"@id": "@nest"},
        "value": "bibo:isbn"
      }
    }
  },
  "@id": "book/one",
  "@type": "Print",
  "instanceOf": {
    "contributionByRole": {
      "aut": {
        "agent": {"name": "Some Body"}
      }
    }
  },
  "identifiedByType": {
    "Isbn": {
      "value": "1234567890"
    }
  },
  "provisionActivityByType": {
    "Publication": {
      "date": "1999",
      "agent": {"name": "PubCorp"}
    }
  }
}

From which I expected to get (recompacted for brevity):

{
  "@context": {
    "@vocab": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
    "name": "http://www.w3.org/2000/01/rdf-schema#label"
  },
  "@id": "book/one",
  "@type": "bibo:Book",
  "bibo:isbn": "1234567890",
  "published": "1999",
  "publisher": {"name": "PubCorp"},
  "creator": {"name": "Some Body"}
}

(This is actually part of a "trick" (a "context switcheroo", if you will): BibFrame data is compacted into indexed maps, and the context is then swapped out for another one mapping to DC+BIBO, as above, using nests with local contexts. The same structure is thereby interpreted with different semantics, a kind of "poor man's inference" supporting different vocabulary granularities. The input for the above example is thus actual BibFrame data plus an idiomatic context for the shape, using indexes.)

That aside, I do believe that in general, for JSON-in-the-wild, it is quite plausible that structures mapped using nesting properties will vary the meaning of keys within each nest. As it stands, the current algorithm, somewhat arbitrarily IMHO, lacks this possibility. Nothing in the Syntax spec makes it obvious that this is so. In the test suite itself, in the example of test case in06, you could quite plausibly imagine that keys within the defined nests ("links", "attributes" and "relationships") could have different meanings but share the same name (e.g. "links" could use a "title" key with a different meaning from "title" within "attributes").

Furthermore, as far as I can tell, index keys in indexed properties can have property-scoped contexts (assuming that I interpret test case c013 correctly). To support that but not nests is rather unexpected, as they both represent not properties but "sectioned" data.

For reference, see my original question about this on the Public JSON-LD mailing list, and @gkellogg's reply, which included (and I quote):

There was no specific intention to not support this, but looking at the expansion algorithm, it brings up more considerations: for example, we don’t support @context at the top-level of nested properties either.

Nominally, to do what you want would be to duplicate step 8 (including seeing if the key expanding to @nest has an @context defined on it), and see if nesting-key has an embedded context defined, and update active context for the recursive steps; we’d also need to revert after 14.2. Similar steps would be required in the compaction algorithm. We’d also need to consider if we wanted to act on an @context member of nested value after 14.2.2, and exclude the recursive step in that case.

It would be a fairly big change, IMO, and given that we’re about to release an updated CR, probably too late to do it.

Following up on that, I wonder what would happen if step 14.2.2 is just changed from:

    <li>Recursively repeat steps <a href="#alg-expand-each-key-value">13</a>
      and <a href="#alg-expand-resolve-nest">14</a>
      using <var>nested value</var> for <var>element</var>.

to:

    <li>Recursively repeat steps
      <a href="#alg-expand-initialize-property-scoped-context">3</a>,
      <a href="#alg-expand-property-scoped-context">8</a>,
      <a href="#alg-expand-each-key-value">13</a>
      and <a href="#alg-expand-resolve-nest">14</a>
      using <code>@nest</code> as <var>active property</var>, and
      <var>nested value</var> for <var>element</var>.

Would this really require reverting, as opposed to the regular active context state handling done in the algorithm? (Assuming of course that "recursively repeat" here, as is already implicit both here and elsewhere in the algorithm, properly scopes the passed variables to each step without polluting the other steps in the loop.)

(Note that step 3 also has to be assigned id="alg-expand-initialize-property-scoped-context" in the markup if the above is added.)

(Although @gkellogg also noted that the algorithm does not support local (explicit) contexts at the top level of nested properties either, I consider that a separate (potential) issue, for which I see less motivation. As I see it, the main reason for these features to exist ought to be to handle idiomatic JSON with zero edits, and not "native" JSON-LD. I see no practical reasons for it (I personally see little practical reason for explicit local contexts at all in 1.1, since scoped contexts are possible).)

I do believe that there is a missed opportunity here, especially considering the complexity which has been invested into nesting properties and scoped contexts. The "zero edits" goal, which much of JSON-LD 1.1 has put its force behind, warrants this to be thoroughly considered, even at this late hour, IMHO. I'll do what I can to help, if that's needed.

niklasl avatar Feb 19 '20 21:02 niklasl

If the notion of "Recursively repeat ..." implies a separate variable scope, then steps 3 and 8, which modify active context, could be considered an isolated scope that doesn't affect the value of active context for the next iteration.

At the least, the Note following 14.2.2 would need to clarify that those steps are done in a scope that does not allow those variables to change on successive repetition. Although, note that they are updating result, which could be considered a pass by reference.
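The scoping question can be illustrated with a toy model. This is only a sketch under loud assumptions: `resolve`, `scoped` and `expand` are hypothetical helpers, not the spec's algorithm or any library's API; arrays, `@type`, prefixes (compact IRIs) and protected terms are all ignored. It shows the recursion for a nest receiving its own copy of the context while still writing into the shared result.

```python
def resolve(ctx, key):
    """Toy mapping of a key to '@nest', an IRI, or None (not the spec's IRI expansion)."""
    definition = ctx.get(key, key)
    if isinstance(definition, dict):
        definition = definition.get("@id", key)
    if definition.startswith("@") or ":" in definition:
        return definition
    vocab = ctx.get("@vocab")
    return vocab + definition if vocab else None


def scoped(ctx, key):
    """Return a NEW context with key's property-scoped context merged in, if it has one."""
    definition = ctx.get(key)
    if isinstance(definition, dict) and "@context" in definition:
        return {**ctx, **definition["@context"]}
    return ctx


def expand(ctx, element, result=None):
    """Toy expansion: nests share `result` (by reference) but get a private context."""
    result = {} if result is None else result
    for key, value in element.items():
        if key.startswith("@"):
            result[key] = value  # keywords are passed through untouched in this toy
            continue
        target = resolve(ctx, key)
        if target is None:
            continue  # unmapped keys are dropped
        inner = scoped(ctx, key)  # local to this iteration; ctx itself is never mutated
        if target == "@nest":
            expand(inner, value, result)
        elif isinstance(value, dict):
            result.setdefault(target, []).append(expand(inner, value))
        else:
            result.setdefault(target, []).append({"@value": value})
    return result


# Reduced version of the context from the issue, plus a top-level "date" key
# (an addition for illustration) to show that the nest's context does not leak.
CONTEXT = {
    "@vocab": "http://purl.org/dc/terms/",
    "name": "http://www.w3.org/2000/01/rdf-schema#label",
    "provisionActivityByType": {
        "@id": "@nest",
        "@context": {
            "Publication": {
                "@id": "@nest",
                "@context": {"date": "published", "agent": "publisher"},
            }
        },
    },
}
DOCUMENT = {
    "@id": "book/one",
    "provisionActivityByType": {
        "Publication": {"date": "1999", "agent": {"name": "PubCorp"}}
    },
    "date": "2020",
}
expanded = expand(CONTEXT, DOCUMENT)
```

In this toy, "date" inside the nest maps to the DC "published" IRI, while the top-level "date" still maps through `@vocab`, without any explicit revert step. Whether the real algorithm text can rely on the same implicit scoping is exactly the open question.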

gkellogg avatar Feb 22 '20 21:02 gkellogg

Also, I don't see a reasonable way to update the Compaction algorithm, so this would be a one-way transformation. Ideally, compacting a document with a context which uses a scoped context on @nest would get back to the same place, but steps 12.7 and 12.8 of the Compaction Algorithm, which deal with nesting, find terms using the existing active context. The inverse context is based on that, and it is where we look for potential terms to return as item active property. It would require that the inverse context also include nested contexts for any in-scope nest properties having a scoped context, so that they could introduce terms that make use of that scoped context, but this would lead to overlap and would be crazy complicated.

I'll do a PR that does expansion-only scoped contexts on aliases of @nest with some note that it doesn't work for compaction.

gkellogg avatar Feb 22 '20 21:02 gkellogg

@niklasl I addressed this in #388, which includes your specific example as a test case, along with a simpler one.

As I mentioned, note that it doesn't support compaction (the spec should probably say so).

Please let us know if this addresses your issue.

gkellogg avatar Feb 23 '20 22:02 gkellogg

As I mentioned, note that it doesn't support compaction (the spec should probably say so).

I'm worried that we can not keep the compaction algorithm untouched, though.

Consider the following context

{"@context": {
    "foo": "http://example.org/foo",
    "bar": {"@id": "http://example.org/bar", "@nest": "nst"},
    "nst": {"@id": "@nest", "@context": {"bar": "http://different.example.org/bar"}}
}}

used to compact

{ "http://example.org/bar": "BAR" }

I'm under the impression that it would result in

{ "@context": {...},
  "nst": { "bar": "BAR" }
}

in which bar is now interpreted as http://different.example.org/bar...
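A toy round trip makes the asymmetry concrete. This is a sketch only: `toy_compact` and `toy_expand` are hypothetical stand-ins (not the spec's algorithms or any library's API), values are plain strings rather than value objects, and term selection is reduced to a lookup in the top-level context, which is the behaviour being worried about here.

```python
CONTEXT = {
    "foo": "http://example.org/foo",
    "bar": {"@id": "http://example.org/bar", "@nest": "nst"},
    "nst": {"@id": "@nest", "@context": {"bar": "http://different.example.org/bar"}},
}


def iri_of(definition):
    """Return the mapping of a term definition given as a string or a dict."""
    return definition if isinstance(definition, str) else definition.get("@id")


def toy_compact(ctx, expanded):
    """Select terms from the top-level context only, ignoring nest-scoped contexts."""
    inverse = {iri_of(definition): term for term, definition in ctx.items()}
    out = {}
    for iri, value in expanded.items():
        term = inverse[iri]
        definition = ctx[term]
        nest = definition.get("@nest") if isinstance(definition, dict) else None
        target = out.setdefault(nest, {}) if nest else out
        target[term] = value
    return out


def toy_expand(ctx, compacted):
    """Expand while honouring a scoped context on terms that map to @nest."""
    out = {}
    for key, value in compacted.items():
        definition = ctx[key]
        if iri_of(definition) == "@nest":
            local = definition.get("@context", {}) if isinstance(definition, dict) else {}
            out.update(toy_expand({**ctx, **local}, value))
        else:
            out[iri_of(definition)] = value
    return out


original = {"http://example.org/bar": "BAR"}
compacted = toy_compact(CONTEXT, original)      # "bar" ends up under "nst"
round_tripped = toy_expand(CONTEXT, compacted)  # "bar" is now read via the scoped context
```

Under these assumptions `round_tripped` no longer equals `original`: the value has moved from one property IRI to another without any error being raised.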

pchampin avatar Feb 24 '20 22:02 pchampin

That is indeed a problem, but I don't see how to solve it in compaction reasonably; maybe we can't support this feature.

gkellogg avatar Feb 25 '20 00:02 gkellogg

I don't see how to solve it in compaction reasonably

Steps 12.7.2.1 and 12.8.2.1 of the Compaction algorithm currently read:

If nest term is not @nest, or a term in the active context that expands to @nest, an invalid @nest value error has been detected, and processing is aborted.

We could add a condition that prevents compacting to a nesting property with a scoped context:

If nest term is neither @nest nor a term in the active context that expands to @nest, or if nest term is a term whose term definition contains a @context entry, an invalid @nest value error has been detected, and processing is aborted.

This wouldn't break existing test cases (as they don't have scoped contexts in nesting properties).
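The amended condition could look roughly like the following. It is a minimal sketch, assuming the active context is a plain dict of term definitions; `check_nest_term` and `InvalidNestValue` are hypothetical names standing in for the step and for the "invalid @nest value" error, not part of any implementation.

```python
class InvalidNestValue(ValueError):
    """Hypothetical stand-in for the 'invalid @nest value' error."""


def check_nest_term(active_context, nest_term):
    """Raise unless nest_term is @nest or an alias of it WITHOUT a scoped context."""
    if nest_term == "@nest":
        return
    definition = active_context.get(nest_term)
    mapping = definition.get("@id") if isinstance(definition, dict) else definition
    if mapping != "@nest":
        raise InvalidNestValue(f"{nest_term!r} does not expand to @nest")
    # Proposed extra condition: refuse nesting terms that carry their own @context.
    if isinstance(definition, dict) and "@context" in definition:
        raise InvalidNestValue(f"{nest_term!r} has a scoped context")


def is_usable(active_context, nest_term):
    """Convenience wrapper returning a bool instead of raising."""
    try:
        check_nest_term(active_context, nest_term)
    except InvalidNestValue:
        return False
    return True


plain = {"nst": {"@id": "@nest"}}
with_scope = {"nst": {"@id": "@nest", "@context": {"bar": "http://different.example.org/bar"}}}
```

With this check, compaction against `plain` behaves as before, while compaction against `with_scope` is aborted instead of silently producing output that re-expands to a different IRI.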

maybe we can't support this feature

On the expansion side, it is a natural addition, and @niklasl's use case is valid. On the compaction side, it does open a can of worms, which we don't have time to address more elegantly. I think however that this solution (including my patch above) is a move in the right direction, and does not preclude a future WG from improving compaction in order to better support this feature.

pchampin avatar Feb 25 '20 09:02 pchampin

@gkellogg, Thank you for addressing this! I believe your change addresses my issue.

First, a remark on my initial assessment:

Furthermore, as far as I can tell, index keys in indexed properties can have property-scoped contexts (assuming that I interpret test case c013 correctly). To support that but not nests is rather unexpected, as they both represent not properties but "sectioned" data.

I did not interpret that correctly. (I missed that the outer nested property is turned, by type-scoped context, into a type-indexed property, where the index key represents a type which controls the inner index.) Thus the apparent inconsistency remark I made is certainly weakened. I still believe that adding this logic is consistent with what you can expect of expansion though.

I'm gonna elaborate my assessment of where this leads us, possibly feeding into any future work on JSON-LD (post 1.1):

I was, like you, concerned about this introducing an inconsistency with compaction. A possible mitigation, AFAIK, is that there are already certain forms that can be expanded using special contexts but cannot be compacted back into the same shape. From what I've gathered (now also reported by @pchampin in #391), this is the case for nested nests (as seen e.g. in test case in06, which cannot be expanded and compacted back to the same form), and for multiple keys mapping to the same IRI. I reckon that these may not be considered to be on the same level of inconsistency as nest-scoped contexts, but it's hard to judge.

This compaction limitation does prevent some aspects of the "context switcheroo" trick I'm attempting, in that I cannot turn flat data (e.g. using DC) back into the higher-resolution forms (e.g. using BibFrame). It is interesting to note that #391 indicates that there might be a possible related "missing feature": allowing for nested properties to be explicitly declared as nested within nests. My experiment indicated that such a mechanism could be used in such a flat-to-nested manoeuvre, but it's early to tell from my tinkering, and closing time for JSON-LD 1.1, alas.

Maybe there is room for a note about Compaction, to state that algorithms other than the one standardized are perfectly legal to use, as long as the produced data is conforming JSON-LD? That note could mention the examples above which aren't roundtrippable. I think that varying forms of compact JSON-LD do and may continue to abound, and that the expansion rules are the baseline contract here for interoperability. This is implicit throughout, but perhaps it can be spelled out more explicitly? (This "off the beaten spec path" goes, of course, for framing as well.)

Regarding @pchampin's similar findings (recursive nests) and suggestions for mitigation, I think they're in the right direction. The example with the term "pulling the rug out from under itself" is concerning though (and the only way to prevent it is to declare the outer bar as @protected, although of course it seems contrived to design such a context in the first place). To be honest, reflecting upon these aspects, it leads my mind to thinking that nests with contexts are somewhat at odds with the current side-by-side declaration of terms belonging to nests. In fact, nest-scoped contexts seem almost like an alternative to that form (implying changes to term selection and compaction, thus challenging the current design). And I certainly see why reworking nest logic even further at this stage of 1.1 is impossible.

niklasl avatar Feb 25 '20 10:02 niklasl

This issue was discussed in a meeting.

  • RESOLVED: Defer #380 for future version as (1) can be experimental as not forbidden, (2) borderline new feature in feature freeze, (3) dangerously asymmetric
Nested property-scoped fields
Rob Sanderson: https://github.com/w3c/json-ld-api/issues/380
Gregg Kellogg: Last week we discussed that supporting this would be editorial, as no normative changes are required to make this work; only algorithm steps need to be changed.
… The algorithm should be changed to take property-scoped contexts into consideration. The problem is that it is not feasible IMO to do the compaction side of this using the same context, because term selection could be based on prop-scoped or nested terms. So we have to decide to do this for expansion only, and accept that it will come out differently after compaction. What do you think?
Pierre-Antoine Champin: I agree. I don’t see a way to ensure that this round-trips. I’m not too concerned about that, because nested structures are not entirely round-trippable. For example, recursive lists are not round-trippable.
… So I could live with scoped contexts handled in expansion, and not handled in compaction.
… This raises another problem: with the current addition Gregg made, it is possible to compact something using scoped contexts and get back a semantically different result, because compaction does not take scoped contexts into account.
… We should probably not merge this addition change because of this unforeseen consequence. Or we can refrain from compaction if there is a scoped context. It would be a shame, but we should at least be careful not to break compaction as it is.
Ivan Herman: Unsure I understand all the details. I doubt this is something we should do at this point. It sounds like something close to a new feature, and that is a no-no, which goes back to WD.
… We may come back to it at some point in time, which depends on what the future holds for JSON-LD.
… This sounds too much for a CR that we plan to close in a few weeks.
Rob Sanderson: pchampin, could you clarify the scenario where you expand+compact and get different semantics?
Pierre-Antoine Champin: https://github.com/w3c/json-ld-api/issues/380#issuecomment-590574392
Rob Sanderson: I agree that this is on the border of a new feature.
Pierre-Antoine Champin: If you compact the linked data, then it will compact with @nest with a custom meaning of bar in the scoped context. But if I re-expand, bar will have a different meaning.
Rob Sanderson: I can see this happening in practice.
Gregg Kellogg: This is editorial because the syntax doc doesn’t disallow this use-case. So we may need to do something to not make people infer that this is possible. The use case is nice, but not worth the disruption at this point. So we should defer this to a future version.
Pierre-Antoine Champin: https://github.com/w3c/json-ld-api/issues/380#issuecomment-590796364
Pierre-Antoine Champin: The last comment was also in that direction. The algorithm is not meant for this. Niklas agrees that this is not the right moment to handle this.
Ivan Herman: Ok, let’s defer.
Rob Sanderson: Does this also apply to your other issue pchampin?
Pierre-Antoine Champin: No, that one points out the asymmetry.
Proposed resolution: Defer #380 for future version as (1) can be experimental as not forbidden, (2) borderline new feature in feature freeze, (3) dangerously asymmetric (Rob Sanderson)
Gregg Kellogg: +1
Ivan Herman: +1
Harold Solbrig: +1
Ruben Taelman: +1
Tim Cole: +1
Pierre-Antoine Champin: +1
Rob Sanderson: +1
Benjamin Young: +1
David I. Lehn: +1
Resolution #1: Defer #380 for future version as (1) can be experimental as not forbidden, (2) borderline new feature in feature freeze, (3) dangerously asymmetric
Gregg Kellogg: The problem with these types of things is that compaction comes down to the term selection algorithm, which is already insanely complicated. Anything adding more complexity should be carefully considered.

iherman avatar Feb 28 '20 18:02 iherman