SXML
Include special sections in Document_Type
Right now, the parser ignores special sections such as comments, processing instructions, and DOCTYPE declarations. As a consequence, we cannot recreate input documents exactly. If Document_Type retained these sections, we could check that the parsed files from the bulk tests are recreated precisely by comparing output and input character by character.
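A character-wise round-trip check of this kind could be sketched as follows. This is Python for illustration only, not SXML code: `first_mismatch` is a hypothetical helper, and the two strings are made-up examples of an input document and of the output a parser that drops special sections would produce.

```python
from typing import Optional


def first_mismatch(original: str, recreated: str) -> Optional[int]:
    """Return the index of the first differing character, or None if equal."""
    for index, (a, b) in enumerate(zip(original, recreated)):
        if a != b:
            return index
    # One string is a prefix of the other: they differ where the shorter ends.
    if len(original) != len(recreated):
        return min(len(original), len(recreated))
    return None


# Hypothetical input, and the output once the declaration and comment are dropped.
original = '<?xml version="1.0"?><!-- note --><a>text</a>'
recreated = '<a>text</a>'

print(first_mismatch(original, original))   # identical strings: no mismatch
print(first_mismatch(original, recreated))  # differs right after the first '<'
```

Reporting the first mismatching position, rather than only a yes/no result, would make it easier to see which special section was lost in a failing bulk test.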
Our handling of CDATA will also be an issue here: we translate and escape it into content nodes on the fly, which makes it impossible to reproduce a CDATA section from the resulting content.
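The loss of information can be illustrated with any parser that merges CDATA into ordinary content. The sketch below uses Python's standard `xml.etree.ElementTree` as a stand-in. It is not SXML, and SXML's internal representation may differ, but the effect is the same: once a CDATA section and escaped text produce identical content, the serializer cannot tell which form the input used.

```python
import xml.etree.ElementTree as ET

# Two different inputs: one uses a CDATA section, the other an escaped entity.
with_cdata = ET.fromstring("<a><![CDATA[x < y]]></a>")
with_escape = ET.fromstring("<a>x &lt; y</a>")

# After parsing, both carry the same text content.
same_content = with_cdata.text == with_escape.text

# Serializing the CDATA variant yields the escaped form, not the original input.
serialized = ET.tostring(with_cdata, encoding="unicode")

print(same_content)
print(serialized)
```

Reproducing the input exactly would presumably require the content node to record that it originated from a CDATA section, or a dedicated node kind for it. That is a design assumption, not something decided in this issue.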