tikxml

Support charsets other than UTF-8

Open reline opened this issue 4 years ago • 4 comments

reline avatar Nov 26 '20 22:11 reline

Thanks, I will take a look at this tomorrow or in the first days of January.

Just from skimming the code changes, your PR looks good. I will check it in more detail.

Thanks for your contribution and help 👍

sockeqwe avatar Dec 30 '20 09:12 sockeqwe

Overall looks good to me. The real question for me, though, is whether we should set the charset in the builder up front, or read the XML declaration encoding property: <?xml version="1.0" encoding="UTF-8"?> and then decide which charset should be used. I am even less sure how it should best work for writing ...
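To illustrate the second option, here is a minimal sketch of reading the declared encoding from the first bytes of a document. `XmlDeclarationSniffer` and `declaredCharset` are hypothetical names, not part of TikXml or this PR. The sketch assumes an ASCII-compatible encoding and no byte order mark, so it would not detect UTF-16 input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not TikXml API.
final class XmlDeclarationSniffer {

    // Matches the encoding value inside an XML declaration at the very start of the input.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    private XmlDeclarationSniffer() {}

    // Returns the declared charset, or empty if there is no declaration,
    // no encoding value, or the named charset is not supported by this JVM.
    static Optional<Charset> declaredCharset(byte[] head) {
        // The declaration itself is ASCII in any ASCII-compatible encoding,
        // so decoding the prefix as ISO-8859-1 is safe for matching.
        int length = Math.min(head.length, 256);
        String prefix = new String(head, 0, length, StandardCharsets.ISO_8859_1);

        Matcher matcher = ENCODING.matcher(prefix);
        if (!matcher.find()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Charset.forName(matcher.group(1)));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return Optional.empty();
        }
    }
}
```

A reader could buffer the first bytes, call `declaredCharset`, and only then decode the rest of the stream. Writing has no such hint, so the charset would still have to come from the builder.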

sockeqwe avatar Jan 08 '21 17:01 sockeqwe

> Overall looks good to me. The real question for me, though, is whether we should set the charset in the builder up front, or read the XML declaration encoding property: <?xml version="1.0" encoding="UTF-8"?> and then decide which charset should be used. I am even less sure how it should best work for writing ...

I had the same thought, but wasn't sure if that was too out of scope.

I think I would approach it in three steps. First, keep UTF-8 as the default encoding for reading and writing, as it is now. Second, allow an encoding to be set for reading and writing (like I've done here). Finally, only when reading XML, if an encoding property exists, use that encoding instead and disregard whatever was set in the config. The only open question is whether we should allow clients to override the encoding property when reading, in case it is incorrect.

This could be interpreted as a breaking change as well, since currently all files are read using UTF-8 encoding, whereas this would start reading files using the encoding declared in the file.
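To make the precedence concrete, here is a minimal sketch of the read-side decision. `ReadCharsetPolicy`, `resolve`, and the `forceConfigured` flag are hypothetical names for illustration; they are not part of TikXml or this PR.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch of the read-side precedence; not TikXml API.
final class ReadCharsetPolicy {

    private final Charset configured;      // client-supplied charset, UTF-8 when none was set
    private final boolean forceConfigured; // true: ignore the encoding declared in the file

    ReadCharsetPolicy(Charset configured, boolean forceConfigured) {
        this.configured = configured == null ? StandardCharsets.UTF_8 : configured;
        this.forceConfigured = forceConfigured;
    }

    // Picks the charset used to decode a document.
    // 'declared' is the encoding found in the XML declaration, if any.
    Charset resolve(Optional<Charset> declared) {
        if (forceConfigured) {
            // Client override for files whose declaration is known to be wrong.
            return configured;
        }
        // Otherwise the declaration wins, falling back to the configured charset.
        return declared.orElse(configured);
    }
}
```

With `forceConfigured` left at `false`, a file that declares ISO-8859-1 would be decoded as ISO-8859-1 rather than UTF-8, which is exactly the behavior change described above. Setting the flag to `true` with no configured charset would reproduce the current UTF-8-only behavior.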

reline avatar Jan 08 '21 17:01 reline

@Bodo1981 what do you think about adding support for this in general?

sockeqwe avatar Jan 20 '21 11:01 sockeqwe