PyYaml can't parse mappings with lists as keys

Open bmccutchon opened this issue 6 years ago • 9 comments

Here is an example of valid YAML from the YAML spec (example 2.11):

? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]

However, PyYaml can't parse this:

import yaml

yaml.safe_load("""? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]""")
yaml.constructor.ConstructorError: while constructing a mapping
  in "<unicode string>", line 1, column 1:
    ? [ New York Yankees,
    ^
found unhashable key
  in "<unicode string>", line 1, column 3:
    ? [ New York Yankees,
      ^

This must be a bug, as it violates the spec. It could be fixed by parsing the list as a tuple.
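To illustrate why a tuple would sidestep the error, here is a minimal sketch in plain Python (no PyYAML involved). The variable names are made up for illustration; the only point is that a `dict` rejects a `list` key but accepts the equivalent `tuple`.

```python
teams = ["New York Yankees", "Atlanta Braves"]

# A list is unhashable, so using it as a dict key raises TypeError.
try:
    {teams: "dates"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# The same items as a tuple are hashable and work as a key.
tuple_keyed = {tuple(teams): "dates"}

print(list_key_ok)
print(tuple_keyed)
```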

bmccutchon avatar Sep 18 '19 21:09 bmccutchon

So it appears to be parsing correctly, but I'm not aware of a generic way to use unhashable keys in a standard Python mapping (and neither is BaseConstructor, which is why it bombs)... As you point out, a specific hack to use a tuple could probably be used in the list case, but it's not a general solution to the problem. E.g., try it with a mapping as a key, or any unhashable Python object ref, and you'll be right back in the same boat.

Someone brought this up a long time ago on an SO thread (https://stackoverflow.com/questions/13538015/sequence-as-key-of-yaml-mapping-in-python), but it doesn't look like it really went anywhere.

Off the top of my head, the only thing I can think of would be to create a generic hashable container object for unhashable keys that could contain a reference to the actual data, but I'm not sure how useful that would be in the real world. You wouldn't be able to use it for lookups against the returned data structure (since the "real" key is an artificial surrogate thing), so it would presumably only be useful by iteration. Would something like that solve your issue?
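A rough sketch of what such a surrogate container might look like, in plain Python. `KeyRef` is a hypothetical name, not anything PyYAML provides. It hashes by object identity, which is why lookups with a freshly built key miss and only iteration is useful.

```python
class KeyRef:
    """Hypothetical hashable stand-in that holds a reference to an unhashable key.

    It defines no __eq__/__hash__, so Python's default identity-based
    behavior applies.
    """

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"KeyRef({self.value!r})"


teams = ["New York Yankees", "Atlanta Braves"]
mapping = {KeyRef(teams): ["2001-07-02", "2001-08-12", "2001-08-14"]}

# A new wrapper around an equal list is a different object, so lookup misses.
found = KeyRef(["New York Yankees", "Atlanta Braves"]) in mapping

# Iteration still gives access to the real key data.
pairs = [(k.value, v) for k, v in mapping.items()]

print(found)
print(pairs)
```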

On a related note: it makes me feel slightly better that ruamel.yaml fails in exactly the same way. ;)

nitzmahone avatar Sep 18 '19 22:09 nitzmahone

(PS, you should be able to subclass XConstructor and override construct_mapping to do the tuple thing or anything else you want today)
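For anyone wanting to try that today, here is a minimal sketch of such a subclass. It assumes `yaml.SafeLoader` as the base and uses a made-up class name, `TupleKeyLoader`; it is not the approach taken in #159, and it only converts top-level list keys. Keys that are still unhashable after conversion (mappings, or lists nested inside the key) raise the same error as before.

```python
import yaml
from yaml.constructor import ConstructorError


class TupleKeyLoader(yaml.SafeLoader):
    """Hypothetical loader: list-valued mapping keys become tuples."""

    def construct_mapping(self, node, deep=False):
        if not isinstance(node, yaml.MappingNode):
            raise ConstructorError(
                None, None,
                "expected a mapping node, but found %s" % node.id,
                node.start_mark)
        self.flatten_mapping(node)  # keep SafeLoader's "<<" merge handling
        mapping = {}
        for key_node, value_node in node.value:
            # deep=True so a sequence key is fully built before it is frozen
            key = self.construct_object(key_node, deep=True)
            if isinstance(key, list):
                key = tuple(key)
            try:
                hash(key)
            except TypeError:
                raise ConstructorError(
                    "while constructing a mapping", node.start_mark,
                    "found unhashable key", key_node.start_mark)
            mapping[key] = self.construct_object(value_node, deep=deep)
        return mapping


DOC = """? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]"""

data = yaml.load(DOC, Loader=TupleKeyLoader)
key = ("New York Yankees", "Atlanta Braves")
print(data[key])
```

The key is constructed with `deep=True` because PyYAML otherwise fills sequences lazily, and converting a not-yet-populated list would produce an empty tuple.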

nitzmahone avatar Sep 18 '19 22:09 nitzmahone

I did a PR for this last year: #159

perlpunk avatar Sep 19 '19 07:09 perlpunk

[...] that ruamel.yaml fails in exactly the same way.

really? It works for me here with ruamel. The only thing ruamel does not support is nested lists.

edit: here are the test results for that specific YAML: http://matrix.yaml.io/details/M5DY.html#ruamel-py

perlpunk avatar Sep 19 '19 07:09 perlpunk

It would be cool if someone could have a look at my PR (#159). It still might need some work (what if we have circular aliases?), but if there's something wrong in general with it, I'd like to know.

perlpunk avatar Sep 21 '19 10:09 perlpunk

It's a pity this hasn't moved for a while. I think conforming to the YAML specification is important functionality.

Alexander-Serov avatar Oct 06 '20 16:10 Alexander-Serov

I agree that, at least for the reasonably simple cases presented, we should be able to handle representing list keys as tuples. I'm willing to work on this for v.next. #159 would need a bit of rework around the error handling to be more robust (and a couple more tests), but I think @perlpunk's underlying concept is sound. I'll add it to the planning project.

I think we need to be very clear though about what will work and what won't: this is only about substituting actual list-typed keys with tuples. Any other Python sequence type, if encountered in that situation, will continue to fail. I think the risk of backward-incompatible changes is otherwise pretty low, since the customization to BaseConstructor is limited to construct_mapping, which hardcodes a dict as the mapping type anyway, so anyone that's customized PyYAML to default to anything other than dict has already overridden this method on the constructor anyway.

nitzmahone avatar Oct 06 '20 18:10 nitzmahone

I am currently running into this problem, as I need to parse a YAML document containing lists as keys. Is this issue still active? I can see a PR was proposed a long time ago.

TytoCapensis avatar Aug 05 '24 14:08 TytoCapensis

Same here

vanyingenzi avatar Aug 27 '24 14:08 vanyingenzi

Just want to add to the chorus here. @perlpunk and @nitzmahone, if you need a pair of eyes or a little bit of effort, let me know and I'll jump in.

doctorjei avatar Aug 01 '25 04:08 doctorjei