Anchor in ignored tag

Open theamarin opened this issue 1 year ago • 3 comments

Hi @flyx,

thank you so much for your unflinching support!

If I define an anchor in a scope that is ignored by NimYAML, aliases that refer to it later cannot be resolved.

Non-working example:

import yaml/serialization

let input = """
anchors:
 - &myAnchor
   id: anchor
items:
 - *myAnchor
"""

type
   Main {.ignore: ["anchors"].} = object
      items: seq[Item]
   Item = object
      id: string

var main: Main
load(input, main)

With the following output:

Error: unhandled exception: alias node refers to object of incompatible type [YamlConstructionError]

Please note that in yaml/serialization.nim:1211, e.aliasTarget is not in c.refs, so val.tag is empty.

I am not sure if this should be fixed or if just the error handling should be adapted, e.g., as follows:

    if e.aliasTarget notin c.refs:
      raise constructionError(s, e.startPos,
        "alias node refers to unknown anchor, possibly in ignored scope")

If you want to fix this, you would probably need to construct the data (and add it to the context refs) when the yamlAlias is reached, as otherwise its data type is unknown.

Any feedback is appreciated.

Thank you!

theamarin avatar Aug 04 '22 12:08 theamarin

There are two issues here:

The first is the problem of referencing anchors in ignored structure. This is a non-trivial problem due to the way NimYAML deserializes data currently. There are some interesting nuances:

anchors:
 - &a
   id: &b anchor
items:
 - id: *b
 - *a

If NimYAML were to say "well, I'll remember the anchored events for later", then in this instance it would remember both the &a and the &b anchored events, where the second is nested within the first. At *b, we can then create a string from the scalar anchor. At *a, however, we need to generate a value of which parts have already been constructed. Not only does the construction of *a need to construct a value seen earlier, it also needs to inject the part that has already been constructed into the stored events of &a to make this fully correct.

The second issue is that you want to deserialize an alias node into a non-ref type. This would currently fail even without the ignored part, e.g.

items:
  - &a
    id: droggeljug
  - *a

You'd get

Error: unhandled exception: Anchor on non-ref type [YamlConstructionError]

Should NimYAML support that in this special case? I don't think so; the rule

you can only have anchors on ref types

is simple while

you can have anchors on ref types and in ignored sections as long as they are referred to at most once

makes the API rather confusing.

When we take the second issue out of the equation by making Item a ref object, I think NimYAML should be able to properly load this. This will not be trivial to implement and may take a while, but I'll try to do it.

flyx avatar Aug 04 '22 13:08 flyx

Thank you again for your quick reply and thorough analysis! I am absolutely happy to make Item a ref object to simplify things. It absolutely makes sense to allow anchors only on ref types, I should have spotted this one.

theamarin avatar Aug 04 '22 13:08 theamarin

Workaround:

var node: YamlNode
load(input, node)

var stream = represent(node, tsNone, asNone)
discard stream.next() # skip stream start event
var main: Main
construct(stream, main)

This loads the YAML into a node graph, then generates a stream from it with asNone, which duplicates a node each time an alias would be produced. That stream can then be loaded into the given type. I would suggest using this approach for situations where YAML anchors are used like variables. However, I seriously need to fix the problem of this going into an endless loop when there's a cycle in the node graph.
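The endless loop on a cyclic node graph mentioned above could, in principle, be avoided by tracking which nodes are currently being expanded. The following is only a minimal sketch on a toy node type, not NimYAML's YamlNode and not how NimYAML implements anything; `Node` and `flatten` are hypothetical names made up for illustration, and it uses only the Nim standard library.

```nim
type
  Node = ref object
    value: string        # scalar content; left empty for collections
    children: seq[Node]  # child nodes; empty for scalars

proc flatten(n: Node, active: var seq[Node]): Node =
  ## Deep-copies `n`. A node that is shared between several parents
  ## (the toy equivalent of an alias) ends up duplicated in the copy.
  ## Raises ValueError if `n` is reachable from itself.
  for a in active:
    if a == n:  # `==` on ref types compares identity
      raise newException(ValueError, "cycle in node graph")
  active.add n
  result = Node(value: n.value)
  for c in n.children:
    result.children.add flatten(c, active)
  discard active.pop()  # `n` is fully expanded, so it is no longer active

proc flatten(n: Node): Node =
  ## Convenience wrapper that starts with an empty set of active nodes.
  var active: seq[Node]
  result = flatten(n, active)
```

A node that merely appears twice is copied twice, while a node that contains itself raises an error instead of recursing forever. Whether raising is the right behaviour for NimYAML is a separate design question.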

flyx avatar Aug 04 '22 13:08 flyx

This use-case has now been implemented via

import yaml / [serialization, dom]

let input = """
anchors:
 - &myAnchor
   id: anchor
items:
 - *myAnchor
"""

type
   Main {.ignore: ["anchors"].} = object
      items: seq[Item]
   Item = object
      id: string

var main: Main
loadFlattened(input, main)
echo $main

loadFlattened replaces aliases with resolved content before loading and basically implements the workaround I've shown. It is available from the dom API since it uses the DOM as shown.

Error messages have been updated appropriately.

flyx avatar Sep 07 '22 14:09 flyx

I am using the workaround you proposed above in my code quite a lot by now, as it removes the need to use ref objects whenever anchors are used.

Just as a reference for others who might want to use this, here is what it looks like with nim-yaml v2:

var s: FileStream = newFileStream(filePath)
var node: YamlNode
load(s, node)
s.close()

# Inline anchors
var stream = represent(node, SerializationOptions(anchorStyle: asNone))
discard stream.next() # skip stream start event
var main: Main
construct(stream, main)

theamarin avatar Nov 28 '23 20:11 theamarin