jackson-dataformat-xml

@JsonAnySetter Mangles Nested Xml Elements and Xml Attributes During Serialization

Open dudleycodes opened this issue 2 years ago • 4 comments

I apologize in advance if there is already a bug opened for this condition (or if I am approaching this the wrong way) - my search-fu failed to turn up anything similar.

I have a situation where I need to deserialize XML to a POJO, and at the end of it all I need to serialize the request back to XML and return it as part of the response. I only need to process a portion of the fields. I'd rather not map out all unused fields on the POJO, as this would require constantly updating the application as the end-user's request schema grows over time.

Describe the bug

I'm attempting to use @JsonAnySetter to capture the unmapped fields during deserialization, and @JsonAnyGetter to serialize them on the way back out.

It works fine for simple values (e.g. <some-unmapped-field>hello</some-unmapped-field>). But when the unmapped field includes attributes (e.g. <some-unmapped-field id = "one">hello</some-unmapped-field>) or nested XML elements (e.g. <some-unmapped-field><a>1</a><b>2</b></some-unmapped-field>), the serialized output becomes completely mangled.

Version information

com.fasterxml.jackson.core:jackson-annotations:2.13.3
com.fasterxml.jackson.dataformat:jackson-dataformat-xml:2.13.3

To Reproduce

The POJO:

@Getter
@Setter
public class SomePojo {
    @JsonProperty("id")
    String id;

    // Other mapped fields/elements...

    @JsonAnyGetter
    @JsonAnySetter
    @JsonUnwrapped
    Map<String, Object> others = new HashMap<>();
}

The input XML:

<some-pojo>
        <id>123</id>
        <unmapped-element>
            <e uid = "1">one</e>
            <e uid = "2">TWO</e>
            <e uid = "3">3</e>
        </unmapped-element>
</some-pojo>

The output XML:

<some-pojo>
     <id>123</id>
     <unmapped-element>
     <e>
                <uid>1</uid><>one
            </>
        </e>
        <e>
            <uid>2</uid><>TWO
        </>
    </e>
    <e>
        <uid>3</uid><>3
    </>
</e>
</unmapped-element>
</some-pojo>

Expected behavior

I would expect the attributes/elements to be kept through the deserialization / serialization process.

Additional context

  • I have tried other values for the underlying map, including String (output is empty) and com.fasterxml.jackson.databind.JsonNode (throws JsonMappingException).
  • I've tried @JsonRawValue, but from what I've read that only affects serialization, and has no effect on deserialization. This has been consistent with my experiments.
  • I've tried writing a custom deserializer, but the underlying JsonParser seems too JSON-oriented, ultimately requiring the input XML to map cleanly to JSON. My guess is that this is why I'm having this issue in the first place.

Thanks for looking!

dudleycodes avatar Aug 18 '22 18:08 dudleycodes

Ok. First of all: yes, your use case makes sense, and your approach as well.

Unfortunately, round-tripping XML without an intermediate POJO is difficult, and specifically there is no way to retain the "attribute-ness" of the input. As such, it is currently not possible to retain that aspect without a POJO that indicates which properties should be serialized as attributes. With 2.14 it will be possible to use JsonNode instead of Map<>, but that probably will not solve these issues (and specifically does nothing with respect to attribute-ness).

Having said that those empty tags (<>) look very odd, and I don't understand them.
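One plausible reading of both symptoms, offered as an assumption rather than a confirmed diagnosis: if the unmapped subtree is flattened into a Map in which attributes, child elements, and element text all become plain name/value entries (with the text stored under an empty key), then writing that map back out as elements yields this kind of output. The sketch below imitates such a flattening with JDK DOM classes only. `FlattenDemo`, `flatten`, and `write` are hypothetical names, and none of this is Jackson's actual implementation.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustration only: mimics a name/value flattening, not Jackson's code. */
final class FlattenDemo {

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Flattens an element; attributes and children end up as identical-looking keys. */
    static Map<String, Object> flatten(Element el) {
        Map<String, Object> out = new LinkedHashMap<>();
        NamedNodeMap attrs = el.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            out.put(a.getNodeName(), a.getNodeValue()); // "attribute-ness" is lost here
        }
        StringBuilder text = new StringBuilder();
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                // Repeated child names overwrite each other in this simplified demo.
                out.put(c.getNodeName(), flatten((Element) c));
            } else if (c.getNodeType() == Node.TEXT_NODE) {
                text.append(c.getNodeValue());
            }
        }
        String t = text.toString().trim();
        if (!t.isEmpty()) {
            out.put("", t); // assumption: text is kept under an empty key
        }
        return out;
    }

    /** Naively writes every map entry as a child element, including the empty key. */
    static String write(String name, Object value) {
        StringBuilder sb = new StringBuilder("<").append(name).append(">");
        if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                sb.append(write(String.valueOf(e.getKey()), e.getValue()));
            }
        } else {
            sb.append(value);
        }
        return sb.append("</").append(name).append(">").toString();
    }

    public static void main(String[] args) {
        System.out.println(write("e", flatten(parse("<e uid=\"1\">one</e>"))));
        // prints: <e><uid>1</uid><>one</></e>
    }
}
```

In this toy model the `uid` attribute comes back as a child element and the text comes back wrapped in an unnamed tag, which resembles the reported output. Whether the same mechanism is at work inside the library would still need the minimal unit test to confirm.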

One possible improvement, conceptually, would be to add some sort of "Native XML" value type, to which XML subsections could be mapped, and that would serialize as XML. I don't have a good idea of how it'd work, problem being that XmlParser sort of converts XML into tokens; this type should ideally by-pass that layer. But if such type existed it would help here.

Other than that, we could try to at least figure out how to get rid of the "empty tags". For that, a minimal unit test would be nice, to reproduce the problem more concisely. That would not solve the problem of losing attribute identity but could at least retain elements and textual contents.

cowtowncoder avatar Aug 19 '22 02:08 cowtowncoder

Thanks for the quick reply and all you do with this project!

Some sort of NativeXML type, or even a @RawXmlValue annotation that could be applied to a Map<String, String> seems like an ideal solution to my use-case.

For now, I'll plan to intercept the raw XML input (before it goes through any parser) and echo its contents back via the response to the user (will require a CWE encoder to prevent abuse). I dislike the idea of this approach, so in the future, I'll see if I can't give the custom deserializer another shot, and post the code to this issue.
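A minimal sketch of that interim approach, using only JDK classes: keep the raw request string untouched, pull out just the fields that need processing, and escape the raw text before embedding it in a response. The names `RawXmlEcho`, `readField`, and `escapeXml` are hypothetical, and treating the "encoder" as plain XML escaping is an assumption about what is needed here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads needed fields but never re-serializes the request. */
final class RawXmlEcho {

    /** Returns the text of the first element with the given name (at any depth), or null. */
    static String readField(String rawXml, String elementName) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            // Harden the parser, since the input is untrusted end-user XML.
            f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rawXml)));
            NodeList hits = doc.getDocumentElement().getElementsByTagName(elementName);
            return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not acceptable XML", e);
        }
    }

    /** Escapes the five predefined XML entities so raw text can be embedded safely. */
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "<some-pojo><id>123</id>"
                + "<unmapped-element><e uid=\"1\">one</e></unmapped-element></some-pojo>";
        System.out.println(readField(raw, "id")); // prints: 123
        System.out.println(escapeXml(raw));       // the request, safe to embed as text
    }
}
```

Because the original string is never rebuilt from parsed tokens, attributes and nesting survive by construction. The cost is that the request is handled twice, once for field extraction and once as opaque text.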

dudleycodes avatar Aug 19 '22 13:08 dudleycodes

The native type would have to be the value type of Map<String, NativeXml>, or possibly the native type could just be composable. Still, I don't have an immediate idea of how it could be implemented, just the concept. But at least it is good to have this as an idea; sometimes solutions surface over time when I go over old issues.
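To make the concept concrete, here is a rough sketch of what a `Map<String, NativeXml>` could hold, written against JDK DOM and Transformer classes rather than Jackson. `NativeXml`, `NativeXmlSketch`, and `captureUnmapped` are all hypothetical; no such type exists in the library, and this sidesteps rather than solves the question of how it would plug into the token-based parser.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NativeXmlSketch {

    /** Hypothetical value type: an opaque, already-serialized XML fragment. */
    static final class NativeXml {
        final String raw;
        NativeXml(String raw) { this.raw = raw; }
    }

    /** Captures each unmapped child of the root as a serialized subtree. */
    static Map<String, NativeXml> captureUnmapped(String xml, Set<String> mappedNames) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            Map<String, NativeXml> others = new LinkedHashMap<>();
            for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c instanceof Element && !mappedNames.contains(c.getNodeName())) {
                    StringWriter w = new StringWriter();
                    // Serialize the whole subtree, attributes included.
                    identity.transform(new DOMSource(c), new StreamResult(w));
                    // Repeated sibling names would overwrite each other in this sketch.
                    others.put(c.getNodeName(), new NativeXml(w.toString()));
                }
            }
            return others;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not capture unmapped XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<some-pojo><id>123</id>"
                + "<unmapped-element><e uid=\"1\">one</e></unmapped-element></some-pojo>";
        Map<String, NativeXml> others = captureUnmapped(xml, Set.of("id"));
        // The captured fragment still carries uid as an attribute.
        System.out.println(others.get("unmapped-element").raw);
    }
}
```

Serializing such a value would amount to writing `raw` verbatim, which is the part that would need to bypass the token layer in a real implementation.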

I agree that in the meantime it will be necessary to use a work-around, and that may indeed be sub-optimal.

cowtowncoder avatar Aug 20 '22 23:08 cowtowncoder

Realized this is in wrong repo, will transfer.

cowtowncoder avatar Jan 27 '24 05:01 cowtowncoder