yaml-spec
Anchor declaration before use
Currently it is expected that an anchor declaration will come before its use as an alias, i.e.
- &foo foo
- *foo
So this is not allowed:
- *foo
- &foo foo
This is very surprising when editing by hand, and parsers that construct an object tree would have no problem supporting this construct.
It is possible for the same anchor to be defined multiple times. An alias refers to the most recent matching anchor:
- &foo A
- *foo # A
- &foo B
- *foo # B
If this were not the case — if an anchor could only be defined once in a document — then it would be sensible for an alias to resolve to the appropriate anchor regardless of their relative locations. But because anchors can be reused, alias resolution must depend on relative location.
If the document you suggest in your second example were supported, it would have to be as a special case, e.g. “if an alias is encountered which does not match any anchor so far, then search forward for the first matching anchor”. As a result, the behavior of the first anchor with a given name would differ from that of other anchors with that name in a way that may not be obvious.
In addition, this would make composition more complicated. Right now, when an alias is encountered, then either it corresponds to a node that we've already composed or it doesn't correspond to anything (and it's an error). In an implementation, this can be handled easily by keeping a mutable map from anchor names to composed nodes and updating the map when a new anchor is encountered.
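To make the "mutable map" point concrete, here is a minimal sketch in Python. It is not taken from any real YAML library: the event tuples `("node", anchor, value)` and `("alias", name)` and the function name `compose_sequence` are made up for illustration, and only a flat sequence is handled.

```python
def compose_sequence(events):
    """Compose a flat sequence from hypothetical parser events.

    Each event is either ("node", anchor_or_None, value) or ("alias", name).
    """
    anchors = {}    # anchor name -> most recently composed node with that name
    composed = []
    for event in events:
        if event[0] == "alias":
            name = event[1]
            if name not in anchors:
                # No earlier anchor with this name: report the error right away.
                raise LookupError(f"alias *{name} has no preceding anchor")
            composed.append(anchors[name])
        else:
            _, anchor, value = event
            if anchor is not None:
                anchors[anchor] = value   # a redefinition replaces the old entry
            composed.append(value)
    return composed


events = [("node", "foo", "A"), ("alias", "foo"),
          ("node", "foo", "B"), ("alias", "foo")]
print(compose_sequence(events))  # ['A', 'A', 'B', 'B']
```

The whole state is one dictionary, and an alias is either resolved or rejected at the moment it is read.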
If aliases could refer to anchors defined later in the document, then this would complicate matters. An unmatched alias could not be immediately identified as an error; rather, the implementation would have to keep separate track of unmatched aliases and patch them into the composed document later if a match is found. Once the entire document had been composed, the implementation would check for remaining unmatched aliases and only then produce an error.
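For comparison, a rough sketch of what that extra bookkeeping might look like, using the "search forward for the first matching anchor" rule. As before, the event tuples and the name `compose_with_forward_aliases` are hypothetical, and only a flat sequence is handled.

```python
def compose_with_forward_aliases(events):
    """Like a one-pass composer, but unmatched aliases wait for a later anchor.

    Each event is either ("node", anchor_or_None, value) or ("alias", name).
    """
    anchors = {}    # anchor name -> most recently composed node
    pending = {}    # anchor name -> indices of placeholders waiting for it
    composed = []
    for event in events:
        if event[0] == "alias":
            name = event[1]
            if name in anchors:
                composed.append(anchors[name])
            else:
                # Cannot tell yet whether this is an error: leave a placeholder.
                pending.setdefault(name, []).append(len(composed))
                composed.append(None)
        else:
            _, anchor, value = event
            if anchor is not None:
                anchors[anchor] = value
                # Patch every alias that was waiting for this name.
                for index in pending.pop(anchor, []):
                    composed[index] = value
            composed.append(value)
    if pending:
        # Only at the end of the document do we know these never matched.
        raise LookupError("aliases without any anchor: " + ", ".join(sorted(pending)))
    return composed


print(compose_with_forward_aliases([("alias", "foo"), ("node", "foo", "foo")]))
# ['foo', 'foo']
```

Note the second dictionary, the placeholder patching, and the error check that has moved to the end of the document. In a real composer the placeholders would sit inside nested nodes, which makes the patching step harder than it looks here.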
So while the suggestion makes sense in a model where anchors are unique, it runs into conceptual problems when anchors are reused and it would make implementation more complicated.
If a future version of the YAML spec were to forbid duplicate anchors (a matter on which I take no position), then at that point I think that it might make more sense to allow out-of-order aliases. The implementation costs would still exist, but the conceptual and consistency problems would vanish.
See also #48
For a (C/C++) programmer, not being able to use something that hasn't been declared comes quite naturally. Also, YAML appears to be constructed in such a way that everything should be one-pass processable. This gives great opportunities for speed optimizations and/or memory reduction: all nodes can be evicted from memory as soon as the walk (depth-first, which is along the YAML text) goes to a sibling or recedes to a parent node, unless the node had an anchor. If you allow forward references, i.e. aliasing some node that is anchored later on, you don't know the content at the moment you stumble upon the alias. You have to come back to that part of the tree. This complicates algorithms a lot.
I also second Thom1729's opinion.
In practice, if you dislike going down into every detail in the place where you need a node for the first time, there is a solution: define it somewhere earlier, where it is more convenient, with an anchor, and then alias it in the place where you want to use it (without too many details). I've hand-written YAML structures of 2MiB (well, with some generated lists, of course) in that manner that are still very much understandable for humans and can be processed by rather simple tools.
@UnePierre,
At this point the general team consensus is not moving towards supporting anchors-after-aliases for future versions.
What we are considering is having the *... syntax become a general "reference" node:
- *foo - reference to an anchor (same as now)
- */foo/0/bar - absolute "ypath" reference
- *../foo - relative path reference
- *foo/../../bar - reference to a path relative to an anchor
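To illustrate how the four forms could be told apart, here is a small Python sketch. None of this is specified anywhere: the function name `parse_reference`, the returned tuple shape, and the assumption that "/" separates path segments are all guesses made for illustration.

```python
def parse_reference(text):
    """Split a '*...' reference into (kind, anchor, segments).

    Hypothetical grammar, assumed for illustration only:
      *name            -> ("anchor", "name", [])
      */a/0/b          -> ("absolute", None, ["a", "0", "b"])
      *../a            -> ("relative", None, ["..", "a"])
      *name/../../b    -> ("anchor", "name", ["..", "..", "b"])
    """
    if not text.startswith("*") or len(text) == 1:
        raise ValueError(f"not a reference: {text!r}")
    body = text[1:]
    if body.startswith("/"):
        return ("absolute", None, body[1:].split("/"))
    head, _, rest = body.partition("/")
    segments = rest.split("/") if rest else []
    if head in (".", ".."):
        return ("relative", None, [head] + segments)
    return ("anchor", head, segments)


for ref in ["*foo", "*/foo/0/bar", "*../foo", "*foo/../../bar"]:
    print(ref, "->", parse_reference(ref))
```

Under these assumptions a plain `*foo` keeps its current meaning, since it parses as an anchor reference with an empty path.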
I'm sorry that you've had to maintain 2MB YAML files. :(
We're also working on file composability and other things that will ease the pain people have now with using YAML at scale. Most of these things can actually be accomplished already with YAML 1.2 syntax.
Now that we are done publishing the 1.2.2 spec, we expect to roll out these kinds of future plans as a set of RFCs in the coming weeks. Stay tuned!
Sorry if you misunderstood my post. I'm very happy with YAML and its capabilities! It's more like: although my files contain 2MiB+ of useful data, the representation is still such that I'm able to cope. As are my peers. And the tools we wrote that work on this amount of data aren't slow, either.
I'm looking forward to a "ypath" concept. Probably out of scope: will there be references into other YAML files? I'll stay tuned, as you suggested.
Hi Max,
Yes, the plan is for references to other files, or something to that effect, to work. In fact they are crucial for some other interesting things to come.
Regards
-- Pantelis
> What we are considering is having the *... syntax become a general "reference" node. */foo/0/bar - absolute "ypath" reference
Seems nice but ... complex.
For what it's worth, I agree that for the common case it would be very nice to 'hoist' anchor definitions so that they can be put at the end of a file.
I am not sure how often anchors are actually redefined in practice, so this may be a use case that could be deprioritized and supported some other way (*). But I can say that forcing you to put anchor definitions at the beginning of files will for sure discourage their use in many cases, as it pushes "below the fold" other content that you'd probably want to emphasize more.
(*) For example, you could perhaps use a different syntax for anchors that are redefined, e.g., &&anchor and if needed maybe **anchor.
> (*) For example, you could perhaps use a different syntax for anchors that are redefined, e.g., &&anchor and if needed maybe **anchor.
This makes it impossible to collate YAML documents / snippets without scanning for re-defined anchors in later parts of the text.
With the references mentioned above, you can also hoist definitions to the end of the file. Mock-up example:
a:
  b: */hoist/later
  c: foo
  d: bar
hoist:
  later: something we want to write at a later time
... looks quite readable to me.
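As a rough idea of how a tool could resolve such a mock-up today, here is a Python sketch that works on an already-loaded tree. It assumes (my assumption, not anything from the proposal) that the loader left the references in place as plain strings starting with `*/`; `lookup` and `resolve_references` are hypothetical helper names.

```python
def lookup(root, path):
    """Follow an absolute path such as '/hoist/later' from the document root."""
    node = root
    for segment in path.strip("/").split("/"):
        # List nodes are indexed by number, mapping nodes by key.
        node = node[int(segment)] if isinstance(node, list) else node[segment]
    return node


def resolve_references(node, root):
    """Return a copy of node with every '*/...' string replaced by its target."""
    if isinstance(node, dict):
        return {key: resolve_references(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_references(value, root) for value in node]
    if isinstance(node, str) and node.startswith("*/"):
        # The target may itself contain references, so resolve it as well.
        return resolve_references(lookup(root, node[1:]), root)
    return node


document = {
    "a": {"b": "*/hoist/later", "c": "foo", "d": "bar"},
    "hoist": {"later": "something we want to write at a later time"},
}
resolved = resolve_references(document, document)
print(resolved["a"]["b"])  # something we want to write at a later time
```

Because the whole tree is available before resolution starts, the position of `hoist` in the file does not matter. The sketch does no cycle detection, so a reference that points back to itself would recurse without end.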
@UnePierre - yeah, that's pretty readable and nice. I presume this would also work?
a:
  b:
    <<: */hoist/later
    prop2: 30
  c: foo
  d: bar
hoist:
  later:
    prop1: 15
> @UnePierre - yeah, that's pretty readable and nice. I presume this would also work?
That's how I understood https://github.com/yaml/yaml-spec/issues/44#issuecomment-932736889
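For the merge-key variant, a sketch of one possible reading, again in Python on an already-loaded tree. It assumes that the reference is left as a plain `*/...` string under the `<<` key and that keys written in the mapping itself win over merged ones, as with the existing merge key; `lookup` and `expand_merge` are hypothetical names.

```python
def lookup(root, path):
    """Follow an absolute path such as '/hoist/later' through nested mappings."""
    node = root
    for segment in path.strip("/").split("/"):
        node = node[segment]
    return node


def expand_merge(mapping, root):
    """Return a copy of mapping with a '<<' path reference merged in."""
    result = {}
    reference = mapping.get("<<")
    if isinstance(reference, str) and reference.startswith("*/"):
        # Start from the referenced mapping ...
        result.update(lookup(root, reference[1:]))
    # ... then let the locally written keys override it.
    result.update({key: value for key, value in mapping.items() if key != "<<"})
    return result


document = {
    "a": {"b": {"<<": "*/hoist/later", "prop2": 30}, "c": "foo", "d": "bar"},
    "hoist": {"later": {"prop1": 15}},
}
print(expand_merge(document["a"]["b"], document))  # {'prop1': 15, 'prop2': 30}
```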