Control characters in data
Hi,
when string fields in model data contain control characters (e.g., \x02), these characters are written to the XML output unchanged. However, such characters are not valid in XML documents according to the XML 1.0 specification (https://www.w3.org/TR/xml/#charsets). I’m not sure how best to handle such cases, as different users may have different requirements. Perhaps a parameter could be added to SerializerConfig to control the behavior. For example, the default behavior could be to strip these characters, with an additional option such as "raise" (to throw an error).
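To make the idea concrete, here is a rough sketch of the two behaviors. It is not xsdata code: the function name `sanitize_xml_text` and the `mode` values are hypothetical, and the character class is my reading of the `Char` production in the XML 1.0 specification.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize_xml_text(value: str, mode: str = "strip") -> str:
    """Hypothetical helper: strip or reject characters XML 1.0 does not allow."""
    if mode == "strip":
        # Silently drop every disallowed character.
        return _INVALID_XML_CHARS.sub("", value)
    if mode == "raise":
        match = _INVALID_XML_CHARS.search(value)
        if match:
            raise ValueError(
                f"Invalid XML character {match.group()!r} at index {match.start()}"
            )
        return value
    raise ValueError(f"Unknown mode: {mode!r}")
```

With this, `sanitize_xml_text("a\x02b")` gives `"ab"`, while tabs and newlines are left alone because XML 1.0 allows them.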
Hi @nmrtv, you are using the native XmlEventWriter, right? I am not sure why XMLGenerator lets these characters through, but if you switch to the LxmlEventWriter you will get a validation error from lxml:
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
Under the hood, we rely on lxml or Python's SAX utilities for most of this.
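For reference, the pass-through can be reproduced with the standard library alone. This is a minimal sketch, not the xsdata writer itself; `write_with_sax` is a made-up helper that drives `xml.sax.saxutils.XMLGenerator` directly.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_with_sax(text: str) -> str:
    """Write a single <value> element containing `text` via XMLGenerator."""
    buffer = io.StringIO()
    generator = XMLGenerator(buffer, encoding="utf-8")
    generator.startDocument()
    generator.startElement("value", {})
    # characters() escapes &, < and > but does not check for control characters.
    generator.characters(text)
    generator.endElement("value")
    generator.endDocument()
    return buffer.getvalue()


output = write_with_sax("a\x02b")
# The control character is still present in the serialized document.
print("\x02" in output)
```

So any stripping or raising would have to happen before the text reaches the SAX writer.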