Will YAML dicts always be "Ordereddict"?

Open papadeltasierra opened this issue 2 years ago • 7 comments

It looks from reading the PyYAML source that YAML dicts are always presented as a Python OrderedDict, but I cannot find this stated explicitly anywhere in the PyYAML documentation. I need some YAML to be read in order, ideally as a dict, so this would be perfect. However, I cannot write code that relies on my reading of a library's internals. If the dicts will always be OrderedDict, can this be stated explicitly somewhere so that I can justify relying on it? Thanks!

papadeltasierra avatar Aug 30 '22 14:08 papadeltasierra

As a rule, YAML mappings are not ordered. Two YAML documents that differ only in key order are semantically equivalent, and code that processes YAML documents may freely re-order keys. If you need the semantics of an ordered mapping, I'd suggest using a sequence of single-key mappings, like this:

- foo: 1
- bar: 2
- baz: 3
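
A minimal sketch of consuming that pattern once it has been loaded. It assumes the document was parsed with something like yaml.safe_load, which yields a list of single-key dicts for the YAML above; the helper name ordered_items is hypothetical and not part of PyYAML.

```python
from typing import Any, Iterable, Iterator, Mapping, Tuple


def ordered_items(entries: Iterable[Mapping[str, Any]]) -> Iterator[Tuple[str, Any]]:
    """Yield (key, value) pairs from a sequence of single-key mappings.

    Hypothetical helper for illustration; not a PyYAML function.
    """
    for entry in entries:
        if len(entry) != 1:
            raise ValueError(f"expected exactly one key per entry, got {dict(entry)!r}")
        # Each entry holds exactly one pair, so unpack it directly.
        (key, value), = entry.items()
        yield key, value


# The structure yaml.safe_load returns for the three-item document above.
loaded = [{"foo": 1}, {"bar": 2}, {"baz": 3}]

# The order comes from the list, so it does not depend on how any
# implementation stores mappings internally.
pairs = list(ordered_items(loaded))
print(pairs)
```

Because the order lives in the sequence rather than in a mapping, it should survive a round trip through any conforming YAML implementation.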

Thom1729 avatar Aug 30 '22 14:08 Thom1729

Thanks @Thom1729. Why do you use OrderedDict then, instead of a regular dict? Is there some other benefit that I've not twigged? And thanks for the tip - that's exactly what I'd done ;-).

papadeltasierra avatar Aug 30 '22 16:08 papadeltasierra

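I haven't actually done much work on pyyaml, and I'm not sure why it uses an ordered dict. However, since Python 3.6 the basic dict implementation has preserved insertion order anyway (an implementation detail in CPython 3.6, guaranteed by the language from 3.7), so it doesn't make much difference.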

Thom1729 avatar Aug 30 '22 17:08 Thom1729

"It looks from reading the PyYAML source that YAML dicts are always presented as a Python Ordereddict"

Can you link to the code you are referring to and clarify what you mean by "presenting" here?

The only place OrderedDict is used is in the representer: https://github.com/yaml/pyyaml/blob/master/lib/yaml/representer.py#L375. That code is only used when dumping an OrderedDict to YAML.
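
A small sketch of the dumping direction described here, assuming PyYAML is installed and using the default (non-safe) yaml.dump. The sample keys are made up for illustration. The point is that OrderedDict only shows up when serializing: the representer writes it out under a Python-specific tag.

```python
from collections import OrderedDict

import yaml

# Illustrative data; the keys are arbitrary.
ordered = OrderedDict([("zebra", 1), ("apple", 2)])

# The default Dumper has a representer registered for OrderedDict, so the
# output carries a Python-specific tag instead of being a plain mapping.
dumped_ordered = yaml.dump(ordered)
print(dumped_ordered)

# A plain dict holding the same pairs is written as an ordinary YAML mapping.
dumped_plain = yaml.dump(dict(ordered))
print(dumped_plain)
```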

perlpunk avatar Aug 30 '22 17:08 perlpunk

@Thom1729 has it right: absent a user/tag override of this behavior, the built-in loaders will always deserialize a YAML mapping as a Python dict, so the behavior you'll see will depend on the default Python dict behavior. In Python 3.6+, those always preserve insertion order (but are not OrderedDict). If you're actually seeing the OrderedDict type come out, it's because something has changed the default behavior (e.g., a !!python tag in the input document or a customized loader).
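
A quick way to check this for yourself, sketched under the assumption that PyYAML is installed and no custom loader or !!python tag is involved. The document text and key names are made up for illustration.

```python
from collections import OrderedDict

import yaml

# Keys deliberately not in alphabetical order.
document = "zebra: 1\napple: 2\nmango: 3\n"

data = yaml.safe_load(document)

# The built-in loader constructs a plain dict, not an OrderedDict.
is_plain_dict = type(data) is dict
is_ordered_dict = isinstance(data, OrderedDict)

# On Python 3.7+ the dict keeps the order in which the loader inserted
# the keys, which is the order they appear in the document.
keys_in_document_order = list(data)

print(is_plain_dict, is_ordered_dict, keys_in_document_order)
```

Note that the ordering seen here is a property of Python's dict, not a promise made by PyYAML or by the YAML spec.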

nitzmahone avatar Aug 30 '22 17:08 nitzmahone

... and yeah, to your original question: officially the spec says mapping keys are unordered, though in practice many implementations preserve the input document key order by default. Some relevant discussion necromancy: https://github.com/go-yaml/yaml/issues/30#issuecomment-56230946 - basically there's no plan to change that behavior in pyyaml, but if your documents need to interop with arbitrary other implementations and preserve the ordering semantics regardless of their internal mapping->($dictish_thing) impl, better to use a less-convenient data structure that guarantees order.

nitzmahone avatar Aug 30 '22 18:08 nitzmahone

@perlpunk My mistake. I had not understood the code well enough and believed that the Representer was how PyYAML built up the objects when parsing. I'll stick to the dict-as-a-list method suggested by @Thom1729.

papadeltasierra avatar Sep 01 '22 09:09 papadeltasierra