tikxml
Support charsets other than UTF-8
Thanks, I will take a look at this tomorrow or in the first days of January.
Just from skimming the code changes, your PR looks good. I will check it in more detail.
Thanks for your contribution and help 👍
Overall looks good to me. The real question for me, though, is whether we should set the charset in the builder up front or read the encoding property of the XML declaration:
<?xml version="1.0" encoding="UTF-8"?>
and then decide which charset should be used. I'm even less sure how it should work best for writing ...
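For illustration, reading the declared encoding could look roughly like the sketch below. It is not code from this PR or from TikXml: the class `DeclaredEncoding` and its `detect` method are hypothetical names, and it assumes an ASCII-compatible encoding with no byte order mark, so UTF-16 input is not covered.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of TikXml.
final class DeclaredEncoding {
    // Matches encoding="..." or encoding='...' inside an XML declaration
    // that starts at the very first character of the input.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml[^>]*?\\bencoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    private DeclaredEncoding() {}

    /** Returns the declared charset, or null if none is declared or it is not supported. */
    static Charset detect(byte[] head) {
        // ISO-8859-1 maps every byte to exactly one char, so an ASCII
        // declaration survives this decoding step unchanged.
        int length = Math.min(head.length, 256);
        String prefix = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = ENCODING.matcher(prefix);
        if (!matcher.find()) {
            return null;
        }
        try {
            return Charset.forName(matcher.group(2));
        } catch (IllegalArgumentException e) {
            // Thrown for both illegal and unsupported charset names.
            return null;
        }
    }
}
```

Only the first bytes of the stream need to be inspected, so the reader could peek at them before choosing how to decode the rest.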
I had the same thought, but wasn't sure if that was too out of scope.
I think I would approach it like this: keep UTF-8 as the default encoding for reading and writing, as it is now, and also allow an encoding to be set for reading and writing (like I've done here). Then, only when reading, if the XML declares an encoding, use that encoding and disregard whatever was set in the config. The only open question is whether we should let clients override the declared encoding when reading, in case it is incorrect.
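The precedence described here could be sketched as follows. This is only an illustration under assumptions: `CharsetResolver`, `resolveForReading`, and the `forceConfigured` flag are hypothetical names, not part of this PR or of TikXml's API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of TikXml.
final class CharsetResolver {
    private CharsetResolver() {}

    /**
     * Picks the charset used for reading.
     *
     * @param configured      charset set by the client, or null if none was set
     * @param declared        charset from the XML declaration, or null if absent
     * @param forceConfigured true if the client wants to ignore the declaration
     */
    static Charset resolveForReading(Charset configured, Charset declared,
                                     boolean forceConfigured) {
        // UTF-8 stays the default when the client configured nothing.
        Charset fallback = configured != null ? configured : StandardCharsets.UTF_8;
        if (forceConfigured || declared == null) {
            return fallback;
        }
        // The declared encoding wins over the configured one.
        return declared;
    }
}
```

With `forceConfigured` set, a client could still read a file whose declaration is wrong. Writing would not consult a declaration at all and would simply use the configured charset or UTF-8.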
This could be interpreted as a breaking change as well, since currently all files are read using UTF-8 encoding, whereas this would start reading files using the encoding declared in the file.
@Bodo1981 what do you think about adding support for this in general?