pyyaml icon indicating copy to clipboard operation
pyyaml copied to clipboard

empty string key is considered as complex mapping key

Open guoqiao opened this issue 4 years ago • 3 comments

In yaml spec, ? (question mark and space) is used to indicate complex mapping key, e.g.: list, dict. However, when empty str ("") is used as key in yaml, it's also dumped with ? , which seems not reasonable.

To reproduce in ipython:

In [1]: from yaml import dump                                                                                                  
In [2]: dump({"": "value"})                                                                                                    
Out[2]: "{? '' : value}\n"

I know it seems weird to use "" as key, but unfortunately it's widely used in one of our main services in production for long time, it's nearly impossible to change that. At least empty str is valid key in both yaml and python, it should be considered as a normal string key, in theory.

guoqiao avatar May 18 '20 21:05 guoqiao

I noticed that ruamel.yaml doesn't do this:

>>> import ruamel.yaml
>>> print(ruamel.yaml.dump({'': {'': 'foo'}}, default_flow_style=False))                                                                                                                                                                 
'':
  '': foo

I couldn't find an option in PyYaml to have it behave in a similar way.

akaihola avatar Aug 05 '20 19:08 akaihola

I took some time to debug this behavior and found the issue. I will open a PR right now to improve it.

Thanks @akaihola for tagging the ruamel code, it helped a lot on debugging by comparison.

Seems like on ruamel.yaml they are checking for the existence of event.style attribute, and on PyYaml we are not doing this check:

ruamel.yaml:

def check_simple_key(self):
        (...)
        return length < self.MAX_SIMPLE_KEY_LENGTH and (
            isinstance(self.event, AliasEvent)
            or (isinstance(self.event, SequenceStartEvent) and self.event.flow_style is True)
            or (isinstance(self.event, MappingStartEvent) and self.event.flow_style is True)
            or (
                isinstance(self.event, ScalarEvent)
                # if there is an explicit style for an empty string, it is a simple key
                and not (self.analysis.empty and self.style and self.style not in '\'"')
                and not self.analysis.multiline
            )
            or self.check_empty_sequence()
            or self.check_empty_mapping()
        )

pyaml:

def check_simple_key(self):
        (...)
        return (length < 128 and (isinstance(self.event, AliasEvent)
            or (isinstance(self.event, ScalarEvent)
                    and not self.analysis.empty and not self.analysis.multiline)
            or self.check_empty_sequence() or self.check_empty_mapping()))

athossampayo avatar Nov 14 '21 19:11 athossampayo

I don't know who I can tag to ask for a review on the PR, as it's very small changes 😕

athossampayo avatar Mar 28 '22 13:03 athossampayo