Serializing `Map`s will produce non-well-formed XML as keys are not modified as per XML name rules
Hello,
I cannot serialize a Map to a valid XML String if the Map contains a key with whitespace.
I am using jackson-dataformat-xml version: 2.5.3
Here is sample code to reproduce this issue:
package com.asite.html;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import com.fasterxml.jackson.core.JsonParseException;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonMappingException;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
public class Test {
public static void main(String[] args) throws IOException{
parseMapWithNoSpace(); //This method works fine
parseMapWithSpace(); //This method will throw IOException
}
private static void parseMapWithNoSpace() throws JsonProcessingException, IOException, JsonParseException, JsonMappingException {
Map<String, String> mapWithNoSpace = new HashMap<>();
mapWithNoSpace.put("mapWithNoSpace", "test test"); //Key with no space
XmlMapper mapper = new XmlMapper();
String xml = mapper.writeValueAsString(mapWithNoSpace);
System.out.println(xml);
Map map1 = mapper.readValue(xml, Map.class);
System.out.println(map1);
}
private static void parseMapWithSpace() throws JsonProcessingException, IOException, JsonParseException, JsonMappingException {
Map<String, String> mapWithSpace = new HashMap<>();
mapWithSpace.put("test space", "test"); //key with space
XmlMapper mapper = new XmlMapper();
String xml = mapper.writeValueAsString(mapWithSpace);
System.out.println(xml);
Map map1 = mapper.readValue(xml, Map.class);
System.out.println(map1);
}
}
Here is console output
<HashMap xmlns=""><mapWithNoSpace>test test</mapWithNoSpace></HashMap>
{mapWithNoSpace=test test}
<HashMap xmlns=""><test space>test</test space></HashMap>
Exception in thread "main" java.io.IOException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,30]
Message: Attribute name "space" associated with an element type "test" must be followed by the ' = ' character.
at com.fasterxml.jackson.dataformat.xml.util.StaxUtil.throwXmlAsIOException(StaxUtil.java:24)
at com.fasterxml.jackson.dataformat.xml.deser.XmlTokenStream.next(XmlTokenStream.java:164)
at com.fasterxml.jackson.dataformat.xml.deser.FromXmlParser.nextToken(FromXmlParser.java:453)
at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringMap(MapDeserializer.java:461)
at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:342)
at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:26)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3562)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
at com.asite.html.Test.parseMapWithSpace(Test.java:33)
at com.asite.html.Test.main(Test.java:15)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,30]
Message: Attribute name "space" associated with an element type "test" must be followed by the ' = ' character.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:601)
at javax.xml.stream.util.StreamReaderDelegate.next(StreamReaderDelegate.java:88)
at org.codehaus.stax2.ri.Stax2ReaderAdapter.next(Stax2ReaderAdapter.java:129)
at com.fasterxml.jackson.dataformat.xml.deser.XmlTokenStream._collectUntilTag(XmlTokenStream.java:349)
at com.fasterxml.jackson.dataformat.xml.deser.XmlTokenStream._next(XmlTokenStream.java:312)
at com.fasterxml.jackson.dataformat.xml.deser.XmlTokenStream.next(XmlTokenStream.java:162)
... 8 more
You do realize that spaces are not allowed in XML element names? Content like:
<test space>
is not well-formed XML: the space ends the element name, so "space" is parsed as an attribute, and it is missing a value.
So, what you have is invalid input.
@cowtowncoder
I am not generating this XML, XmlMapper does. That is my question: why is XmlMapper generating invalid XML? I can put any String in a Map as a key.
I am hitting this issue while using RestTemplate.
ie.
Client:
ResponseEntity<Map> response = RESTTEMPLATE.postForEntity(INITIALURL + "/getMap", templateParamRequest, Map.class);
Service:
Map<String, String> map = new HashMap<>();
map.put("test space", "test");
return new ResponseEntity<>(map, HttpStatus.OK);
I don't understand why you closed this issue.
@nitinvavdiya If so, please explain that in description: it emphasized problem of reading (deserialization), not generation. Failure to read would be due to invalid XML.
You are also reporting this against ancient version (2.5.3): it should be verified against a later version; at very least latest patch for 2.5.x; but given that 2.5 has not been supported for over 2 years, should be reproduced against 2.8(.9). Based on problem itself, however, I think this does still happen. This is not always the case for all bugs; there have been dozens of other fixes to XML module.
I will go ahead and modify the issue title with information.
This issue is more severe than it looks. Jackson cannot serialise the following simple Java Data Structures into valid XML:
- Map<LocalDate, Object>
- Map<Integer, Object>
- Map<Double, Object>
etc. The issue being that XML Tags are not allowed to start with a number.
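To make the failure mode concrete, here is a minimal sketch of a name check, assuming only an ASCII subset of the XML 1.0 `Name` production (the full rule also admits many non-ASCII characters). The class `XmlNameCheck` and its method are hypothetical helpers for illustration and are not part of Jackson.

```java
// Hypothetical helper, not a Jackson API: a conservative ASCII-only
// approximation of the XML 1.0 Name production.
final class XmlNameCheck {

    // Returns true only if the string is a legal XML name under the ASCII subset.
    // The colon is deliberately rejected, since namespace-aware parsers treat it
    // as a prefix separator.
    static boolean isAsciiXmlName(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        if (!isNameStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isNameChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // First character: ASCII letter or underscore.
    private static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    }

    // Later characters: additionally digits, hyphen and period.
    private static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }
}
```

Under this check, `String.valueOf(42)`, the `LocalDate` text `2020-01-01` and `test space` are all rejected, while `mapWithNoSpace` is accepted.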
The same problem occurs with ObjectId (it can start with a number). Serialization generates XML that ignores the XML specification.
@benchdoos problem with ObjectIds should be reported as separate issue: this is just for escaping of Map key values.
I am not sure what could be done here: problem is that there is no way to encode non-name characters (in XML sense) into something that can be read back -- escaping by entities only works for character data, not names. As such there are really only bad options:
- Try to validate, throw exception on input that can not be written
- Replace invalid characters with something else (or remove): results in well-formed XML, but changes content
at least without changing encoding.
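A rough sketch of the second option (replacing invalid characters), mainly to show why it changes content. It assumes the same ASCII-only view of XML names; `XmlKeySanitizer` is a hypothetical helper, not something the XML module provides.

```java
// Hypothetical helper, not a Jackson API: rewrites a Map key into a
// well-formed (ASCII-subset) XML element name by substituting '_'.
final class XmlKeySanitizer {

    static String sanitize(String key) {
        if (key.isEmpty()) {
            return "_"; // an element name cannot be empty
        }
        StringBuilder sb = new StringBuilder(key.length() + 1);
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            boolean startOk = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
            boolean laterOk = startOk || (c >= '0' && c <= '9') || c == '-' || c == '.';
            if (i == 0 && !startOk) {
                // Legal later but not first (digit, '-', '.'): keep it behind a '_' prefix.
                // Illegal everywhere: replace it with '_'.
                sb.append('_');
                if (laterOk) {
                    sb.append(c);
                }
            } else {
                sb.append(laterOk ? c : '_');
            }
        }
        return sb.toString();
    }
}
```

The output is well-formed, but the mapping is lossy: `test space` and `test_space` both become `test_space`, so two distinct keys can collide and the original key cannot be recovered on read.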
Conversely it would be possible to use different style of XML structure for values (like <key> and <value> entries; or write key as attribute). But this would break backwards compatibility.
JAXB seems to use <map><entry><key>...</key><value>...</value></entry>...</map> structure, which makes sense in that sense.
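For comparison, a minimal sketch of that entry-style layout written with the plain StAX API from the JDK. This is neither how JAXB nor how the Jackson XML module implements it; `EntryStyleMapWriter` is a hypothetical class that only shows that keys written as character data need no name-rule handling.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration: writes <map><entry><key>..</key><value>..</value></entry></map>.
final class EntryStyleMapWriter {

    static String toXml(Map<?, ?> map) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("map");
            for (Map.Entry<?, ?> e : map.entrySet()) {
                w.writeStartElement("entry");
                // The key is character data here, so spaces and leading digits are fine.
                w.writeStartElement("key");
                w.writeCharacters(String.valueOf(e.getKey()));
                w.writeEndElement();
                w.writeStartElement("value");
                w.writeCharacters(String.valueOf(e.getValue()));
                w.writeEndElement();
                w.writeEndElement(); // </entry>
            }
            w.writeEndElement(); // </map>
            w.flush();
            w.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException("Failed to write XML", ex);
        }
        return out.toString();
    }
}
```

Calling `EntryStyleMapWriter.toXml(map)` with the `"test space"` key from the report puts the key inside a `<key>` element instead of using it as an element name. The trade-off is the one noted above: existing readers expecting key-named elements would no longer understand the output.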
Since this change would be backwards incompatible maybe it could be hidden behind a configuration setting.
Conversely it would be possible to use different style of XML structure for values (like <key> and <value> entries; or write key as attribute). But this would break backwards compatibility.
Sure, but JAXB should throw an exception, or log a warning (perhaps configurable), saying that this kind of key is not allowed by the XML spec and is invalid. But it doesn't. That's the point.
@benchdoos I am not saying behavior here is good, just enumerating challenges, trying to think through possible solutions.
But I guess one possible option would be to try to add/enable output-side validation for names since I agree that producing non-well-formed content is the worst of choices.
I'll see if Stax2 has options to force validation.
Looks like code like:
WstxOutputFactory staxF = new WstxOutputFactory();
staxF.setProperty(WstxOutputProperties.P_OUTPUT_VALIDATE_NAMES, true);
XmlFactory f = XmlFactory.builder()
.outputFactory(staxF)
.build();
final ObjectMapper MAPPER = mapperBuilder(f).build();
does catch the invalid write, although the exception being thrown:
com.fasterxml.jackson.core.JsonGenerationException: Illegal name character ' ' (code 32)
is not really optimal. Also, unfortunately this is not a Stax2 API property, so it would not work with Aalto.
It is possible to make XML module see if the property is supported, enable if so. Need to think whether it makes sense.
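A sketch of what such a probe could look like, using only the standard `XMLOutputFactory.isPropertySupported` and `setProperty` calls. The class `NameValidationSupport` is hypothetical, and the property string is an assumption about the value behind `WstxOutputProperties.P_OUTPUT_VALIDATE_NAMES`; it should be checked against Woodstox before relying on it.

```java
import javax.xml.stream.XMLOutputFactory;

// Hypothetical helper: enable a factory property only when the
// underlying StAX implementation reports that it supports it.
final class NameValidationSupport {

    // Assumption: the string value of WstxOutputProperties.P_OUTPUT_VALIDATE_NAMES.
    static final String WSTX_VALIDATE_NAMES = "com.ctc.wstx.outputValidateNames";

    // Returns true if the property was supported and has been set to TRUE,
    // false if the implementation does not know the property.
    static boolean enableIfSupported(XMLOutputFactory factory, String property) {
        if (!factory.isPropertySupported(property)) {
            return false;
        }
        factory.setProperty(property, Boolean.TRUE);
        return true;
    }
}
```

With this shape, `enableIfSupported(factory, WSTX_VALIDATE_NAMES)` would quietly return `false` on implementations that do not recognize the property, leaving their behavior unchanged.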