dump with encoding="utf8" produces "latin"-encoded bytes

Open cknoll opened this issue 2 years ago • 1 comment

I have just observed the following behavior, which I think is wrong or at least quite unexpected:

In [1]: import yaml

In [2]: yaml.dump("ä", encoding=None)
Out[2]: '"\\xE4"\n'

In [3]: yaml.dump("ä", encoding="utf8")
Out[3]: b'"\\xE4"\n'

In [4]: yaml.dump("ä", encoding="latin")
Out[4]: b'"\\xE4"\n'

In [5]: "ä".encode("latin")
Out[5]: b'\xe4'

In [6]: yaml.__version__
Out[6]: '6.0'

My Python version is 3.8.6.

cknoll, Jan 27 '22 13:01

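Update: I found that passing allow_unicode=True makes the encoding option take (predictable) effect. Without it, non-ASCII characters are written as escape sequences, so the resulting bytes are pure ASCII and look the same for every encoding. I think this should be mentioned in the docs: https://pyyaml.org/wiki/PyYAMLDocumentation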
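For anyone who wants to reproduce this, here is a minimal sketch of the difference. It assumes PyYAML 6.0 as in the session above; the variable names are only illustrative, and I use the codec name "latin-1" instead of the "latin" alias.

```python
import yaml  # PyYAML (third-party package)

text = "ä"

# Default (allow_unicode=False): the character is written as an escape
# sequence, so the encoded bytes are ASCII-only for either codec.
escaped_utf8 = yaml.dump(text, encoding="utf-8")
escaped_latin = yaml.dump(text, encoding="latin-1")

# allow_unicode=True: the character is emitted literally, and the chosen
# encoding now determines the actual bytes.
raw_utf8 = yaml.dump(text, allow_unicode=True, encoding="utf-8")
raw_latin = yaml.dump(text, allow_unicode=True, encoding="latin-1")

print(escaped_utf8 == escaped_latin)  # same ASCII-only output
print(raw_utf8)
print(raw_latin)

# All variants load back to the original string once decoded correctly.
assert yaml.safe_load(escaped_utf8) == text
assert yaml.safe_load(raw_utf8.decode("utf-8")) == text
assert yaml.safe_load(raw_latin.decode("latin-1")) == text
```

So the escaped output is still valid YAML that round-trips; it is just not what one might expect from the name of the encoding parameter.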

Also, there seem to be processing errors in the generated docs, leading to !!str, !!python/str, etc.

cknoll, Jan 27 '22 13:01