
Ox.load fails with UTF-8 characters in xml element names

Open mzientkowski opened this issue 1 year ago • 3 comments

Ox.load fails in hash mode when XML element names contain UTF-8 characters:

xml = "<ń>TEST</ń>"
Ox.load(xml, mode: :hash)
EncodingError: invalid symbol in encoding US-ASCII :"\xC5\x84"

mzientkowski, May 25 '23 15:05

The default encoding for strings in Ruby is or was US-ASCII. If you want UTF-8 then use a proper XML prolog to set the encoding. <?xml version='1.0' encoding='utf-8'?> should take care of that.

ohler55, May 25 '23 16:05

The XML prolog has no effect:

xml = "<?xml version='1.0' encoding='utf-8'?>\n<ń>TEST</ń>"
Ox.load(xml, mode: :hash)
EncodingError: invalid symbol in encoding US-ASCII :"\xC5\x84"

It fails only with mode: :hash.

mzientkowski, Sep 12 '23 06:09

I'll take a look at how :hash is different.

ohler55, Sep 13 '23 12:09

@ohler55 I tried to implement a fix for this issue in the mentioned PR. I'll be using this fork in my project for now. If there is anything to add or modify before you consider merging it, please feel free to tell me.

Uelb, Mar 21 '24 11:03