yq
yq copied to clipboard
Fix merge anchor traversing
- Allow inline maps instead of just aliases (e.g.
{<<: {a: 42}},{<<: [{a: 42}]}) - Disallow nested sequences (e.g. ~~
{<<: [[{a: 42}]]}~~) - Disallow other types (e.g. ~~
{<<: 42}~~)
Closes #2386
I have a question though: I see in operator_traverse_path_test.go that goccy would reject at least some invalid merge types at parse time, which might make using custom tags like <<: !include file.yaml (see #2386) impossible. Am I right about that? And does that mean that this usage will not be supported in the future? If so, that would be a bit sad.
Wait, this actually doesn't fix explode(.) yet. Converting to draft for now, I'll do that later.
EDIT: nvm fixed it
Ok, the excessive exploding fix I mentioned above may actually cause problems of its own:
echo '{a: {b: &b 42}, <<: {c: *b}}' | yq 'explode(.)' now returns {a: {b: 42}, c: *b}, which also has a dangling alias...
However, I don't really know how to solve this, because it depends on if the mapping node to which the merge anchor refers is also being exploded, e.g. with {r: &r {...}, m: {<<: *r}} we don't want to explode .r, although we might make a copy and then explode it.
And apparently this issue is already present when partially exploding: echo '{a: {b: &b 42}, c: *b}' | yq 'explode(.a)' in the latest release v4.45.4 returns {a: {b: 42}, c: *b}, which also has a dangling alias. Is that a bug or intentional?
Minor comment, need more time to review this and consider if there are any side effects - thanks for the contribution though!
This is really good - I think you have a clearer picture of how merge works better than I do 😅 - I'm going to go over it again later just to double check it as it's a sensitive part of the code.
Thanks for having another look! Did you see my comments in the thread above as well? Currently I've actually broken key overriding in regular maps for traversing, and I don't have a good solution yet...
Ah spelling is also wrong I see, didn't know this was a bri'ish codebase ;)
Edit: apparently the codebase it not entirely consistent, but anyway
Ok I think to properly fix explode I need to take a step further and rewrite the reconstructAliasedMap - which I've done in a branch MakeExplodeGreatAgain and included your extra tests 👍🏼 I've also changed the warning to just be a generic you haven't set the flag warning, rather than trying to specify which maps are affected.
Tempted to merge it into master - it'll no doubt conflict with what you've got here though :/
Ok I think to properly fix explode I need to take a step further and rewrite the reconstructAliasedMap - which I've done in a branch
MakeExplodeGreatAgain
Nice, I'll merge it in here as well
@mikefarah I notice that fixedReconstructAliasedMap in MakeExplodeGreatAgain is lacking a lot of logic that is present in reconstructAliasedMap. Compare:
$ echo '{e: &empty {}, b: &b 42, a: {b2: *b}, <<: *empty}' | ./yq 'explode(.)'
[WARNING] 14:18:06 --yaml-fix-merge-anchor-to-spec is false; causing merge anchors to override the existing values which isn't to the yaml spec. This flag will default to true in late 2025.
{e: {}, b: 42, a: {b2: 42}}
vs
$ echo '{e: &empty {}, b: &b 42, a: {b2: *b}, <<: *empty}' | ./yq --yaml-fix-merge-anchor-to-spec 'explode(.)'
{e: &empty {}, b: &b 42, a: {b2: *b}}
This is because it's not using overrideEntry. It's also not using Context, but I'm actually not sure what the exact purpose of that is.
This may also break aliases (*b is invalid now):
$ echo '{b: &b 42, m: {e: &empty {}, a: {b2: *b}, <<: *empty}}' | ./yq --yaml-fix-merge-anchor-to-spec 'explode(.)'
{b: 42, m: {e: &empty {}, a: {b2: *b}}}
I'm seeing if I can quickly fix it in the merge.
I'm seeing if I can quickly fix it in the merge.
I thought of a way to quickly solve the issue in a different way, by evaluating merge keys first, I hope you don't mind.
I'm now looking at the issue in my code I mentioned above.
Ok, it should be all done now, I hope! Well, maybe the tests could be cleaned up a bit / made more consistent between traverse & explode.
Btw, I thought of some interesting edge cases, where I don't know what the intended way to handle them would be, which again is probably one of the reasons merge anchors are not part of the recent spec anymore: the more I know about them, the less sense it makes to me to use regular keys for behavior like this.
(current behavior in bold)
{<<: {a: 42}, <<: {b: 42}}:{a: 42, b: 42}/{b: 42}? Arguably it should be the latter, since after de-duplication only the second merge key remains{<<: {a: 42}, !!str <<: {b: 42}}:{a: 42, !!str <<: {b: 42}}/{!!str <<: {b: 42}}? I think this is right, since the keys have different types{<<: {a: 42}, !!merge mykey: {a: 37, b: 42}}:{a: 42, b: 42}/{a: 37, b: 42}/invalid?
Not that other YAML tools handle them in a good way; yamllint.com with input {<<: {a: 42}, !!str <<: {b: 42}} gives {a: 42, b: 42} (as did yq previously; with explode, not traverse).
Ah the reason I was processing like that was to preserve key order on merging. Not a big issue though, this is looking good! Thanks for PR!
Hmm - only problem with the way you're processing explode is that they key order is mangled if merge is processed first :/ Not a massive issues I guess...though I might see if I can fix that
True, although YAML does act like merge keys come first, so it kinda makes sense in a way
Hey @mikefarah, I'm really confused because it seems like something went really wrong merging this; a lot of changes didn't make it and some logic from MakeExplodeGreatAgain that I changed it in this MR was merged. E.g.
fixedReconstructAliasedMapre-appeared!!str <<is broken again- Inline merge anchors are broken without
--yaml-fix-merge-anchor-to-specwhile I fixed those for both cases - Nested merge anchors (
{<<: {<<: {a: 42}}}) are broken even with the fix - Probably more
Yeah, I brought those changes back in to ensure the correct map key order. Afaik I included all the tests you added; so there's shouldn't be any gaps :/ - but maybe something went awry for those cases.
Inline merging is under the flag, that's intentional - my thinking is that it's best to leave the old way of merging as untouched as possible seeing the new way is a complete re-write.
Not sure what !!str << exactly refers to? Will take a look at nested inline merge anchors when I get a chance - though it seems pretty niche; not sure what the use-case would be?
@mikefarah I'm a bit sad to see many things that worked with this MR not work anymore. I put in a lot of work to try and get everything right, only to see most of it break again. Sorry if that sounds to harsh, I'm still really glad you're maintaining this useful tool and actively looking at issues and comments. Anyway, more specifically, it still doesn't work anymore for what I was trying to do originally, also with the flag. (Specifically because of nested inline merge anchors.)
I think that aside from the order, which I understand you might want to keep the same but shouldn't impact the meaning, I made sure to keep everything compatible with previous behavior with and without the flag. I also think that I kept the order the same as before when the flag was not specified (although I'm not 100% sure of the top of my head); maybe you can check this, and in that case, can't the other fixes just be included with and without the flag? I don't think those could break previous code.
Regarding behavior with the flag, fixedReconstructAliasedMap bypasses all original fixes from this MR, so it doesn't handle inline maps in merge anchors, and has some other shortcomings.
I think that not all tests are ran for both with & without the flag anymore where they could be. Maybe that's why it seems to be working fine.
!!str << is not a merge key, it's just a regular key that happens to be called <<.
I could make a PR to fix the things that are broken now and add tests for those, but the thing is; this was that PR :/