jackson-dataformat-xml

`XmlMapper` output not well-formed when Object keys use invalid XML name characters

Open • rlbns opened this issue 3 years ago • 1 comment

I've been working with a lot of web APIs and converting the JSON to XML using XmlMapper. It generally works great, but I have an example that creates invalid XML. I am using the latest version of Jackson (2.13.1).

When you call this API you get the attached JSON output. https://world.openfoodfacts.org/api/v0/product/7622300315733.json

The Jackson pretty-printer has no problem indenting this nicely (attachment: BadXmlFromJson.zip).

After converting it to XML (also in the attached ZIP) you get some issues that prevent Xerces or Saxon from parsing it:

  1. Namespace prefixes are not declared
  2. Some elements are not well formed

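For example, at line 48 you see this: `<agribalyse_proxy_food_code:en>12315</agribalyse_proxy_food_code:en>`, where a namespace-aware parser reads `agribalyse_proxy_food_code` as a namespace prefix that was never declared.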

and at line 653 you see this: `<1>`, which is an error because XML names must start with a letter or underscore.
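Both failures come from JSON keys that are not safe XML element names. Here is a minimal sketch of a pre-check that flags such keys before serialization. `XmlNameCheck` and `isSafeElementName` are hypothetical names, not part of Jackson, and the pattern is an ASCII-only approximation of the XML Name rule that deliberately rejects colons as well.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not a Jackson class): conservative, ASCII-only XML name check. */
final class XmlNameCheck {
    // First character: ASCII letter or underscore.
    // Remaining characters: ASCII letters, digits, '.', '-' or '_'.
    // ':' is rejected on purpose, since namespace-aware parsers treat it as a prefix separator.
    private static final Pattern SIMPLE_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9._-]*");

    /** Returns true if the key can be used unchanged as an element name. */
    static boolean isSafeElementName(String key) {
        return key != null && SIMPLE_NAME.matcher(key).matches();
    }
}
```

With this check, keys such as `1` and `agribalyse_proxy_food_code:en` are reported as unsafe, while an ordinary key like `product` passes. The real XML Name production also allows many non-ASCII characters, which this sketch does not accept.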

I'm using XmlMapper like this:

		XmlMapper xmlMapper = new XmlMapper();
		return xmlMapper.writeValueAsString(jsonTree);

I can't find any features in FromXmlParser.Feature or ToXmlGenerator.Feature that apply to either of these issues.

Is there another way to configure XmlMapper or is this a bug?

rlbns, Jan 12 '22 17:01

It'd be nice to have the test inlined here instead of in a zip archive. But aside from that, yes, there is a problem with using non-XML-name characters in Map keys or as POJO property names -- names are used as-is. XML has no mechanism for escaping name characters, so there is no way to safely emit names that contain such characters.

However, #531 (added in 2.14.0) does add a mechanism that might work: it allows names to be translated using a convention, which avoids this problem.

cowtowncoder, May 29 '23 20:05
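To illustrate what "translation using a convention" could look like, here is a rough sketch of a replacement-style rename that could be applied to keys before they are written as element names. This is not the API added by #531. `XmlNameSanitizer` and `toSafeName` are hypothetical names, and the underscore convention is an assumption chosen for illustration.

```java
/** Hypothetical helper (not a Jackson class): maps arbitrary keys to ASCII-safe XML names. */
final class XmlNameSanitizer {

    /**
     * Replaces every character that is not an ASCII letter, digit, '.', '-' or '_'
     * with '_', and prefixes '_' when the first character may not start a name.
     */
    static String toSafeName(String key) {
        if (key == null || key.isEmpty()) {
            return "_"; // an element name must not be empty
        }
        StringBuilder sb = new StringBuilder(key.length() + 1);
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            boolean startOk = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
            boolean laterOk = (c >= '0' && c <= '9') || c == '.' || c == '-';
            if (startOk) {
                sb.append(c);
            } else if (laterOk) {
                if (i == 0) {
                    sb.append('_'); // digits, '.' and '-' cannot start a name
                }
                sb.append(c);
            } else {
                sb.append('_'); // includes ':' to avoid undeclared namespace prefixes
            }
        }
        return sb.toString();
    }
}
```

Under this convention `1` becomes `_1` and `agribalyse_proxy_food_code:en` becomes `agribalyse_proxy_food_code_en`. The mapping is lossy: distinct keys such as `a:b` and `a_b` collide, and the original key cannot be recovered when reading the XML back. A reversible encoding would be needed for round-tripping.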