Properly escape content in built CDATA blobs

Open • NattyNarwhal opened this issue 1 year ago • 1 comment

xmlservice can emit characters that are invalid in XML. For example, EBCDIC substitution characters (0x3F), whether stored in a database or produced when the target encoding can't represent a character, are converted to ASCII substitute characters (0x1A). These are not allowed in XML, even in a CDATA block (XML 1.0 forbids all ASCII control characters except tab, newline, and carriage return). Some ideas:

  • If we're outputting to Unicode, put a Unicode replacement character (U+FFFD) in instead. I believe XML allows these. Unicode output isn't guaranteed though.
  • Silently drop the characters or replace them with, e.g., a space. This would require scanning the buffer before appending it to build the CDATA block. Might be surprising for users.
  • Don't use CDATA, always use entities to escape XML special or disallowed characters. Complicates building the string.
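To make the options above concrete, here is a minimal Python sketch (not xmlservice code, which is not written in Python). The helper names `sanitize` and `escape_field` are hypothetical. It scans a string for characters outside the XML 1.0 `Char` production, replaces them with U+FFFD by default (pass `""` to drop them or `" "` to substitute a space), and then entity-escapes the result instead of wrapping it in CDATA.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Anything outside the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize(text, replacement="\uFFFD"):
    """Replace characters that XML 1.0 forbids (hypothetical helper).

    replacement="\uFFFD" corresponds to the first idea, while
    replacement="" or " " corresponds to the second.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


def escape_field(text):
    """Third idea: no CDATA, entity-escape after removing invalid characters."""
    return escape(sanitize(text))


if __name__ == "__main__":
    field = "A\x1aB <tag> & more"  # 0x1A is the ASCII substitute character
    xml_doc = "<data>" + escape_field(field) + "</data>"
    # The document now parses without any special handling by the client.
    print(repr(ET.fromstring(xml_doc).text))
```

Whichever replacement is chosen, the scan happens once per field before it is appended, which is the extra cost the second idea mentions.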

Basically, the clients should always get back a valid XML blob that doesn't need special handling before parsing.

Related to zendtech/IbmiToolkit#178

NattyNarwhal, Feb 15 '24 16:02

I think the current workaround for this is to set hex='on'; however, that requires double the space. I'd prefer some kind of base64 encoding instead.
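For a rough sense of the space difference, here is a small Python sketch comparing the two encodings on the same payload. The function name `encoded_sizes` is hypothetical, and base64 is only the preference stated above, not an option the thread shows xmlservice offering.

```python
import base64
import binascii


def encoded_sizes(raw):
    """Return (hex_length, base64_length) for the same raw bytes."""
    hex_len = len(binascii.hexlify(raw))  # 2 output characters per input byte
    b64_len = len(base64.b64encode(raw))  # 4 output characters per 3 input bytes, padded
    return hex_len, b64_len


if __name__ == "__main__":
    payload = bytes(range(256)) * 4  # 1024 bytes of arbitrary data
    print(encoded_sizes(payload))
```

Hex always doubles the size, while base64 adds roughly a third, and both keep the output within characters that are safe in XML.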

I also think CDATA is not a good solution for XMLSERVICE: if the field contains XML, you could end up with nested CDATA, and I don't know how that's handled. I think escaping is the better way in all cases.
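On the nested CDATA question: per the XML specification, a CDATA section ends at the first `]]>`, so sections cannot nest. The Python sketch below illustrates this with the standard library parser. The helper name `wrap_cdata` is hypothetical, and splitting `]]>` across two sections is a common general workaround, not something the thread says xmlservice does.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


if __name__ == "__main__":
    field = "<note><![CDATA[inner]]></note>"  # a field that itself contains CDATA

    # Naive wrapping: the outer section ends at the inner "]]>", so the
    # remainder is parsed as markup and the document is not well-formed.
    naive = "<data><![CDATA[" + field + "]]></data>"
    try:
        ET.fromstring(naive)
    except ET.ParseError as err:
        print("naive CDATA wrapping fails:", err)

    # Splitting "]]>" keeps CDATA usable, at the cost of extra string handling.
    split = "<data>" + wrap_cdata(field) + "</data>"
    print(ET.fromstring(split).text == field)

    # Entity escaping needs no special case for "]]>".
    escaped = "<data>" + escape(field) + "</data>"
    print(ET.fromstring(escaped).text == field)
```

Both the split and the escaped forms round-trip the field, but only escaping avoids having to search for `]]>` specifically.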

kadler, Feb 15 '24 16:02