yaml-ld Define anchor usage in yaml-ld

As a JSON-LD editor, I want to use YAML anchors, so that I can easily reuse content.

Note

The specification should define:

when it is legitimate to use anchors;
what the expectations on anchor usage are (e.g. do anchors represent a specific JSON-LD node, or can they be used for arbitrary content?);
whether there are any constraints on anchor usage (e.g. the representation graph MAY / MUST NOT be a cyclic graph...).

example 1

---
- "@id": &homer http://example.org/#homer  # Anchor the homer url
  http://example.com/vocab#name:
  - "@value": Homer
- "@id": http://example.org/#bart
  http://example.com/vocab#name:
  - "@value": Bart
  http://example.com/vocab#parent:
  - "@id": *homer                               # reuse the anchor instead of re-typing the homer url
- "@id": http://example.org/#lisa
  http://example.com/vocab#name:
  - "@value": Lisa
  http://example.com/vocab#parent:
  - "@id": *homer

example 2

Using anchor and alias nodes https://gist.github.com/ioggstream/31f3226fa9976b3baf0800f44bc19c98

May 30 '22 13:05 ioggstream

Example 2 is from the d3fend.mitre.org cybersecurity ontology.
YAML spec: https://yaml.org/spec/1.2.2/#anchors-and-aliases
Anchors and Aliases can represent non-tree graph structures, whereas JSON is a tree
The above are somewhat atypical examples that reuse small fragments of YAML. The typical example is reusing a whole RDF node, which in JSON-LD happens via @id. Nevertheless, using YAML Anchors and Aliases ensures referential integrity within the document (i.e. that the @id is not mistyped).
We should describe how Anchors and Aliases could mesh with JSON-LD Frames

May 31 '22 09:05 VladimirAlexiev

One place where I believe YAML anchors can help is the description of complex contexts, e.g.:

{
  "@context": {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "@vocab": "http://example.com/ns/Company/",
    "founder": { "@context": {
        "@vocab": "http://example.com/ns/Person/",
        "birthDate": { "@type": "xsd:date" }
    }},
    "employee": { "@context": {
        "@vocab": "http://example.com/ns/Person/",
        "birthDate": { "@type": "xsd:date" }
    }}
  }
}

Notice that the scoped contexts of founder and employee are exactly the same (a "person" context). With YAML anchors, this redundancy could be eliminated.

NB: there are other means to get rid of this redundancy in pure JSON-LD:

hosting the "person" context at a different URL and using that URL instead
defining a type-scoped context for a type Person, and expecting values of founder and employee to be explicitly typed

but they have their drawbacks that are not always acceptable.

May 31 '22 11:05 pchampin

That's exactly the kind of discussions and examples we need :)

"@context":
  xsd: http://www.w3.org/2001/XMLSchema#
  "@vocab": http://example.com/ns/Company/
  founder:
    "@context": &person-context
      "@vocab": http://example.com/ns/Person/
      birthDate:
        "@type": xsd:date
  employee:
    "@context": *person-context

May 31 '22 12:05 ioggstream
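
The anchored context above can be checked against the JSON version with a short script. This is only a sketch, assuming PyYAML (`yaml.safe_load`) as the YAML processor; the variable names are illustrative and not part of any proposal in this thread.

```python
import json
import yaml  # PyYAML, assumed to be installed

# The JSON-LD context with the duplicated "person" scoped context.
JSON_CONTEXT = """
{
  "@context": {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "@vocab": "http://example.com/ns/Company/",
    "founder": { "@context": {
        "@vocab": "http://example.com/ns/Person/",
        "birthDate": { "@type": "xsd:date" }
    }},
    "employee": { "@context": {
        "@vocab": "http://example.com/ns/Person/",
        "birthDate": { "@type": "xsd:date" }
    }}
  }
}
"""

# The same context written once and reused through an anchor and an alias.
YAML_CONTEXT = """
"@context":
  xsd: http://www.w3.org/2001/XMLSchema#
  "@vocab": http://example.com/ns/Company/
  founder:
    "@context": &person-context
      "@vocab": http://example.com/ns/Person/
      birthDate:
        "@type": xsd:date
  employee:
    "@context": *person-context
"""

from_json = json.loads(JSON_CONTEXT)
from_yaml = yaml.safe_load(YAML_CONTEXT)

# After alias resolution both documents carry the same content.
same_content = from_json == from_yaml
print(same_content)
```

With this loader the two forms compare equal, so a JSON-LD processor fed the loaded structure should not be able to tell which form was authored.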

how Anchors and Aliases could mesh with JSON-LD Frames

Frames specify which nodes to expand, and which nodes to merely refer to by URI. So in some sense they tackle the "graph vs tree" problem.

Anchors and Aliases tackle the same problem; intuitively I feel in a more general way.

So: what can be the connection between them?

Jun 02 '22 08:06 VladimirAlexiev

I am not entirely clear on how anchors would actually affect the LD part of the picture. Having a YAML document with anchors, we're going to convert it to JSON — and in that conversion, the anchors will be resolved. Thus, a JSON-LD processor that we will subsequently use won't know anything about those anchors.

This is similar to C preprocessor directives which are resolved before the source file is consumed by the compiler itself.

Is that right? If yes, can't we safely ignore these particular YAML features relying upon YAML spec to describe them?
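
A small sketch of the "resolved before JSON-LD sees it" point, using only the standard library. The shared dict is built by hand as a stand-in for what a YAML loader might return for an anchor and its alias; that stand-in is an assumption, not the output of a real YAML parser.

```python
import json

# Stand-in for a loaded "&person ... *person" pair: one dict, two references.
person = {"@vocab": "http://example.com/ns/Person/"}
loaded = {"founder": person, "employee": person}
shared_before = loaded["founder"] is loaded["employee"]

# Serializing to JSON writes the shared content out twice...
as_json = json.dumps(loaded, indent=2)
copies_in_json = as_json.count("http://example.com/ns/Person/")

# ...and reading it back yields two equal but independent objects.
reloaded = json.loads(as_json)
shared_after = reloaded["founder"] is reloaded["employee"]
equal_after = reloaded["founder"] == reloaded["employee"]

print(shared_before, copies_in_json, shared_after, equal_after)
```

The sharing is visible only before the JSON step; afterwards nothing in the JSON text records that the two values were once the same node.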

Jun 02 '22 16:06 anatoly-scherbakov

Of course, JSON-LD does encode a graph in JSON; what used to be called a node reference is of the form {"@id": "..."}. Framing has an @embed keyword that can control how this works with one or all instances of a node referenced either fully or as a reference.

The YAML anchor/alias mechanism is similar to the framing keywords, and also similar in concept to the @included keyword.

For now, I think we need to be cautious about depending on any YAML features beyond JSON re-serialization until we understand the requirements for round-tripping. A YAML-LD extended profile could allow us to move beyond what can easily be represented in JSON-LD, but we need to tread carefully.

Jun 02 '22 16:06 gkellogg

Anchors can be used to define fragment IDs inside YAML instance data, like attributes @id and href/@name do in HTML.

@ioggstream where was your proposal for such fragments? In addition to anchors, it used JSON Path to address any element in the JSON/YAML structure.

Eg if at https://example.com/TheSimpsons.yaml we have:

*Bart:
  name: Bart Simpsons
  gender: male

Then the alias would be resolved to https://example.com/TheSimpsons.yaml#Bart

The same in plain YAML-LD would look like this:

- "@id": Bart
  name: Bart Simpsons
  gender: male
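
One detail worth checking is how the two forms would resolve to absolute IRIs. The sketch below uses `urllib.parse.urljoin` from the standard library and assumes the document URL is used as the base IRI, with relative references resolved per RFC 3986 as JSON-LD does; whether anchors should map to fragments at all is exactly the open question here.

```python
from urllib.parse import urljoin

# Assumption: the base IRI is the URL the document was retrieved from.
BASE = "https://example.com/TheSimpsons.yaml"

# A fragment-only reference stays inside the document.
with_fragment = urljoin(BASE, "#Bart")

# A bare relative reference replaces the last path segment instead.
without_fragment = urljoin(BASE, "Bart")

print(with_fragment)     # https://example.com/TheSimpsons.yaml#Bart
print(without_fragment)  # https://example.com/Bart
```

Under these assumptions the plain YAML-LD form would presumably need "@id": "#Bart" to yield the same IRI as the fragment-based reading of the anchor.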

--

@anatoly-scherbakov basically says that anchors/aliases must be resolved by the YAML processor and elided, i.e. anchors can only be used locally inside one file. Furthermore, the shared info must be copied out during the resolution. I like @pchampin's concrete example of using aliases to express a context more economically. But being a graph person, I dislike expanding shared graph structures by copying them out.

--

If anchor-based data sharing is necessarily local (limited to one file), then perhaps we can use it at least for blank nodes and avoid copying? Eg

valve1:
  temperature: *temp100C
    value: 100
    unit: degC
valve2:
  temperature: &temp100C

Should result in this turtle

<valve1> :temperature _:temp100C.
<valve2> :temperature _:temp100C.
_:temp100C :value 100; :unit <degC>.

and NOT this one:

<valve1> :temperature [:value 100; :unit <degC>].
<valve2> :temperature [:value 100; :unit <degC>].

Jun 07 '22 12:06 VladimirAlexiev
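
The "shared blank node instead of a copy" behaviour asked for above could be sketched as follows. This is a hypothetical converter, not an existing YAML-LD or JSON-LD algorithm: `to_triples` is an invented helper, the shared dict is built by hand to stand in for an anchored mapping, and blank node labels are derived from Python object identity.

```python
def to_triples(subject, node, triples, labels):
    """Emit (s, p, o) tuples; nested dicts become blank nodes keyed by identity."""
    for key, value in node.items():
        if isinstance(value, dict):
            first_visit = id(value) not in labels
            if first_visit:
                labels[id(value)] = f"_:b{len(labels)}"
            obj = labels[id(value)]
            triples.append((subject, key, obj))
            if first_visit:
                # Describe the blank node only once, however often it is referenced.
                to_triples(obj, value, triples, labels)
        else:
            triples.append((subject, key, value))
    return triples


# Stand-in for the mapping that carries the temp100C anchor in the YAML above.
temp = {"value": 100, "unit": "degC"}
doc = {"valve1": {"temperature": temp}, "valve2": {"temperature": temp}}

triples = []
labels = {}
for name, body in doc.items():
    to_triples(f"<{name}>", body, triples, labels)

for triple in triples:
    print(triple)
```

Both valves end up pointing at the same `_:b0`, which is described once. This only works while the loaded object graph is still in memory; as soon as the data passes through plain JSON, the identity information is gone.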

@VladimirAlexiev let me try to clarify your examples:

Syntax tweak. A keyword cannot start with `*`; Anchor is attached to a node.

Bart: &BartSimpsons  #  create an anchor to this node.
  name: Bart Simpsons
  gender: male

I don't think that this can implicitly map to a @id: Bart, because anchors are a serialization detail. The above document can legitimately be serialized as

Bart: &anchor001  #  create an anchor to this node.
  name: Bart Simpsons
  gender: male

Homer:
  children:
  - *anchor001  # An Alias references an anchor.
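
That anchor names do not survive a load/dump cycle can be observed with PyYAML. This is a sketch assuming PyYAML's `safe_load` and `safe_dump`; other YAML libraries may pick different generated names or offer ways to keep the original ones.

```python
import yaml  # PyYAML, assumed to be installed

SOURCE = """
Bart: &BartSimpsons
  name: Bart Simpsons
  gender: male
Homer:
  children:
  - *BartSimpsons
"""

data = yaml.safe_load(SOURCE)

# Dumping re-creates an anchor for the shared mapping, but under a generated name.
dumped = yaml.safe_dump(data)
print(dumped)

original_name_kept = "BartSimpsons" in dumped
round_trips = yaml.safe_load(dumped) == data
```

The dumped text still shares the node, yet the author's anchor name is replaced by one the library chose, which supports treating anchor names as presentation detail rather than identifiers.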

Representation graph

iiuc the yaml below

t100: &t100 100
valve1:
  temperature: &temp100C
    value: *t100
    unit: degC
valve2:
  temperature: *temp100C

maps to the following YAML representation graph

graph LR;
  root --> t100 & valve1 & valve2
  t100 --> 100
  valve1 --> temperature1[temperature] -->temp100C --> value & unit
  value --> t100
  unit --> degC
  valve2 --> temperature2[temperature] -->temp100C

The first question I asked myself is: how does PyYAML process this information?

PyYAML preserves references when it loads aliased mutable structures: an aliased mapping becomes one shared dict

import yaml

# temperature_yaml is a string holding the YAML document above
temperature = yaml.safe_load(temperature_yaml)
assert temperature['valve1']['temperature']['value'] == 100
assert temperature['valve2']['temperature']['value'] == 100
# assign a new temperature
temperature['valve1']['temperature']['value'] = 200
assert temperature['valve2']['temperature']['value'] == 200  # Changed.

but with an immutable value such as an int, things change: assigning a new value rebinds only that key, so the other references keep the old value

assert temperature["t100"] == 100
assert temperature['valve2']['temperature']['value'] == 100
temperature["t100"] = 200
assert temperature['valve2']['temperature']['value'] == 100  # Not changed.

Jun 09 '22 09:06 ioggstream

Sharing and Cycles (Frames)

Frames are key because they define which part of an RDF graph to unroll into a JSON tree, and how.

@gkellogg in #44

The JSON-LD Framing algorithm is quite complicated as it is.

Agreed, and I don't even know it properly. Of course, we'll use it whole-cloth without modification.

But I intuitively feel that anchors may have something to do with Frames because both address (to some degree) the problem "given a graph, how to serialize part of it as a tree". Both allow sharing objects and handling cycles (to avoid infinite embedding), but:

JSON-LD can share RDF nodes and nothing else
YAML-LD anchors can share finer-grained structures: node URLs, single literals, pieces of objects (similar to @included)
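
On the cycle side, a YAML alias may point back at a node that is still being defined, which plain JSON cannot express. The sketch below assumes PyYAML, which supports such recursive structures on load; the `homer`/`self` keys are made up for illustration.

```python
import json
import yaml  # PyYAML, assumed to be installed

# An alias that refers to the mapping it is nested in: a cyclic representation graph.
CYCLIC = """
homer: &homer
  name: Homer
  self: *homer
"""

data = yaml.safe_load(CYCLIC)
is_cyclic = data["homer"]["self"] is data["homer"]

# The standard JSON encoder refuses to serialize the cycle.
try:
    json.dumps(data)
    json_error = None
except ValueError as exc:
    json_error = str(exc)

print(is_cyclic, json_error)
```

So a rule such as "the representation graph MUST NOT be cyclic" would at least keep every YAML-LD document convertible to JSON, whereas JSON-LD itself avoids the problem by pointing at nodes with @id.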

Modularity/Structuring

@pchampin

anchors can help in the description of complex contexts

JSON Schema has special modularity/structuring facilities, see https://json-schema.org/understanding-json-schema/structuring.html

JSON-LD doesn't have such advanced facilities, so JSON-LD contexts tend to be gigantic.
- In a specific project we assembled https://github.com/gs1/EPCIS/blob/master/epcis-context.jsonld from a bunch of files in https://github.com/gs1/EPCIS/tree/master/JSON-LD-Context, which led to numerous bugs: https://github.com/gs1/EPCIS/issues/307 (see in particular Bug 6, duplication).
JSON-LD modularity is based on including contexts by URL.
- But just like YAML anchors vs. node URLs, Schema inclusion feels finer-grained than context inclusion.
JSON Schema even has $anchor, which is very similar to YAML anchors (but is not used as often as JSON Pointers and the "standard place" $defs). If we adopt #54, we should think about "merging" JSON Schema anchors with YAML anchors.

So the question of YAML fragments and pointers, and how they relate to Schema fragments and JSON Pointers, is key. @ioggstream has been struggling with this problem: please take charge of this, keep up the fight, and we'll help as much as we can!

Syntax tweak

Thanks!

Representation graph

Yes, but the alias "nodes" t100, temp100C are quite different from the others because they carry no info and instead are just redirection pointers (so maybe use a different color).

Jul 11 '22 06:07 VladimirAlexiev

This issue was discussed at the Aug 03 meeting.

Aug 03 '22 17:08 gkellogg
