pyyaml

I don't want Anchors Point. What should I do?

Open qwerplf opened this issue 4 years ago • 10 comments

When using yaml.dump(), the output YAML file contains anchors, which is unacceptable to me. What should I do? [image] [image]

qwerplf, Jul 13 '21 08:07

You are going to have to accept anchors and aliases. The anchor "&id001" at "labels" is saying "Here's some data which we will want to refer to later", and the alias "*id001" is saying "Yes, here's the same data as at &id001".

Without them, the value for "matchLabels" would be null, which would probably break the system you are using.
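To make the anchor/alias behaviour concrete, here is a minimal sketch. The data is invented for illustration (it is not the original poster's file); the only assumption is that one dict object is referenced from two places, which is what causes PyYAML to emit an anchor and an alias.

```python
import yaml

# One dict object referenced from two places (illustrative data only).
labels = {"app": "web"}
doc = {
    "metadata": {"labels": labels},
    "spec": {"selector": {"matchLabels": labels}},
}

# The shared dict is written once with an anchor and referenced by an alias.
text = yaml.safe_dump(doc)
print(text)

# Loading the text back resolves the alias to the same data as the anchor.
loaded = yaml.safe_load(text)
same_data = loaded["metadata"]["labels"] == loaded["spec"]["selector"]["matchLabels"]
```

Any compliant YAML parser should read the alias as a second reference to the anchored mapping, so nothing is missing from the file; the data is simply written once.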

peterkmurphy, Jul 13 '21 23:07

ref: https://github.com/yaml/pyyaml/issues/103

xiaket, Aug 06 '21 08:08

Your link doesn't change the fact that your YAML has the same piece of data referred to three times, and thus needs anchors and aliases to refer to it.

Can I ask: why are anchors and aliases unacceptable for your use case?

peterkmurphy, Aug 09 '21 08:08

In my case, the YAML file will be parsed by either CloudFormation or a third-party library. Anchor support varies between them, and it certainly does not make sense to have those anchors/aliases in the file.

I ended up applying the trick described in the link and simplified the solution like this:

    # Patch the default Dumper class so that it never emits anchors/aliases.
    yaml.Dumper.ignore_aliases = lambda self, data: True
    return yaml.dump(self.template, Dumper=yaml.Dumper)

But it still feels somewhat awkward to have to change the default behaviour of dump in this way.

xiaket, Aug 09 '21 09:08

That may work for you. What does the resulting YAML look like? Hopefully, you haven't lost data in the process. If you have copies of the original data at the "labels" and "matchLabels" parts, then you should be safe.
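One way to check that nothing was lost is to load the alias-free output back and compare it with the original data. This is only a sketch: `NoAliasDumper` and the sample `template` are names made up for the example, and a subclass is used here instead of patching `yaml.Dumper` so the check does not change global behaviour.

```python
import yaml

class NoAliasDumper(yaml.Dumper):  # hypothetical name, for illustration only
    def ignore_aliases(self, data):
        # Treat every object as "not worth aliasing", so repeats are written out in full.
        return True

labels = {"app": "web"}
template = {"labels": labels, "matchLabels": labels}  # same object, used twice

text = yaml.dump(template, Dumper=NoAliasDumper)
print(text)

# Round trip: the reloaded document should compare equal to the original data.
reloaded = yaml.safe_load(text)
data_preserved = reloaded == template
```

With aliases disabled, each occurrence is written as its own full copy, so the two mappings come back as separate but equal objects.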

peterkmurphy, Aug 09 '21 11:08

@peterkmurphy thanks, it works like a charm.

However, to be honest, I'm not happy with the way PyYAML handles anchors and aliases. They are on by default, which may make sense to some folks, but not to me.

xiaket, Aug 09 '21 12:08

I have the same problem: anchors and aliases break my code, and I don't know how to remove them.

MatteoSid, Mar 29 '22 15:03

For my use case, anchors make the output difficult to read, especially when you are trying to debug the output of a Python script. To me, this feature seems redundant, as you would normally compress with brotli/gzip when needed.

edit: Here's my workaround for this, bit hacky but it works ™️ :

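    import json
    import yaml

    # Round-tripping through JSON copies every shared object, so no aliases are emitted.
    output = yaml.safe_dump(
        json.loads(
            json.dumps(data)
        )
    )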
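A small sketch of what the JSON round trip does, with invented sample data. It also shows one side effect worth knowing about: JSON only has string keys, so a non-string key such as the integer `1` comes back as the string `"1"`. Values that JSON cannot serialise at all (dates, for example) would make `json.dumps` raise a `TypeError` instead.

```python
import json
import yaml

shared = {"app": "web"}
data = {"labels": shared, "matchLabels": shared, 1: "one"}  # illustrative data

# json.dumps writes the shared dict out twice; json.loads builds two separate dicts.
plain = json.loads(json.dumps(data))
text = yaml.safe_dump(plain)
print(text)

no_longer_shared = plain["labels"] is not plain["matchLabels"]
key_became_string = "1" in plain and 1 not in plain
```

So the workaround removes the anchors, but only for data that survives a trip through JSON unchanged.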

AGhost-7, Jun 22 '22 17:06

An alternative I prefer to the solution given by @xiaket:

> I ended up applying the trick described in the link and simplified the solution like this:
>
>     yaml.Dumper.ignore_aliases = lambda self, data: True
>     return yaml.dump(self.template, Dumper=yaml.Dumper)

    import sys
    import yaml

    class VerboseSafeDumper(yaml.SafeDumper):
        # Returning True for every object means no anchors or aliases are emitted.
        def ignore_aliases(self, data):
            return True

    yaml.dump(spec, sys.stdout, Dumper=VerboseSafeDumper)

This keeps you on the safe dumper (it subclasses SafeDumper) without converting to and from JSON as in the above comment, and it leaves the default Dumper untouched. It's also a little more obvious, in my opinion, what's going on.

As nice as it would be to set this option as a flag in the top-level entry points of the library, as a user my take is that bailing out of anchors is a little extreme, so something like this is fine.
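For anyone who wants the "flag" ergonomics anyway, a thin wrapper gets close. This is a sketch only: `dump_yaml`, its `aliases` parameter, and `_NoAliasSafeDumper` are hypothetical names for this example and are not part of PyYAML.

```python
import yaml

class _NoAliasSafeDumper(yaml.SafeDumper):  # hypothetical helper class
    def ignore_aliases(self, data):
        return True

def dump_yaml(data, aliases=True, **kwargs):
    """Dump data with SafeDumper; pass aliases=False to write repeats in full.

    Hypothetical wrapper: extra keyword arguments are forwarded to yaml.dump.
    """
    dumper = yaml.SafeDumper if aliases else _NoAliasSafeDumper
    return yaml.dump(data, Dumper=dumper, **kwargs)

shared = {"app": "web"}
doc = {"labels": shared, "matchLabels": shared}

with_aliases = dump_yaml(doc)
without_aliases = dump_yaml(doc, aliases=False)
```

One caveat: a self-referential structure cannot be written without aliases at all, so `aliases=False` is only suitable for data without cycles.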

yurisich, Oct 27 '22 14:10

At the risk of invoking all manner of coder-related flaming...

    print(yaml.safe_dump(json.loads(json.dumps(data))))

slimeandsoakem, May 02 '24 18:05