jackson-dataformats-text

Questionable parsing/serialising of YAML ordered maps

Open dpetroff opened this issue 2 years ago • 3 comments

Ordered maps were introduced in YAML 1.1 and the spec discourages considering key order to be significant otherwise:

object: !!omap
- property: "value"

With Jackson 2.13.3, this is parsed as a Map<Object, List<Map>>, whereas I had expected it to be parsed as a Map<Object, LinkedHashMap>. Besides being a bit odd and unexpected, serialising the parsed object produces:

object:
- property: "value"

Notice the missing !!omap in the output.

I don't know how I would want to explicitly request !!omap serialisation. I'm certainly not suggesting that the key parse order be clobbered with HashMap whenever !!omap is not explicitly specified, just so that LinkedHashMap can suddenly mean !!omap, even though that approach would technically be compliant with the spec. Perhaps a configuration flag to select YAML 1.0 or YAML 1.1+ serialisation of LinkedHashMap would make sense? And maybe wrapper/delegate map objects could explicitly override the !!omap type for a specific LinkedHashMap instance regardless of the flag?
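To make the wrapper/delegate idea concrete, here is a minimal sketch of what such a marker type might look like. Everything in it is hypothetical: `OrderedMapMarker` and `wantsOmapTag` are names invented for illustration, nothing like them exists in Jackson, and the serializer wiring that would actually emit the !!omap tag is not shown.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical marker type: wraps a LinkedHashMap so that a custom
// serializer could tell "please write this as !!omap" apart from an
// ordinary map. Not part of Jackson or SnakeYAML.
final class OrderedMapMarker<K, V> extends AbstractMap<K, V> {

    private final LinkedHashMap<K, V> delegate;

    OrderedMapMarker(Map<? extends K, ? extends V> source) {
        // Copying keeps the iteration order of the source map.
        this.delegate = new LinkedHashMap<>(source);
    }

    @Override
    public V put(K key, V value) {
        return delegate.put(key, value);
    }

    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        return delegate.entrySet();
    }

    // The check a (hypothetical) serializer would perform per value.
    static boolean wantsOmapTag(Object value) {
        return value instanceof OrderedMapMarker;
    }

    public static void main(String[] args) {
        Map<String, String> plain = new LinkedHashMap<>();
        plain.put("property", "value");
        Map<String, String> tagged = new OrderedMapMarker<>(plain);

        System.out.println(wantsOmapTag(plain));  // false
        System.out.println(wantsOmapTag(tagged)); // true
    }
}
```

With a type like this, an unwrapped LinkedHashMap would keep its current behaviour, and only explicitly wrapped instances would opt in to the tag.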

dpetroff, Sep 20 '22 21:09

Hi @dpetroff! Right, this is because the omap type is not currently recognized; I wasn't aware of this type, and the code will essentially ignore it.

Unfortunately I am not sure this can even be handled, as the streaming parser/generator do not have a concept of structured types: they operate on token streams, and the actual Object binding is handled at the format-agnostic databind layer. So the problem is that the distinction between ordered Maps (like LinkedHashMap) and others is not known when decoding or encoding YAML content. Other type tags (the ones that are handled) are for scalar types, and those coercions are easier to support.

It seems that there are at least 2 aspects here, too: reading (deserialization) and writing (serialization). Both seem challenging to support.

I do have one question on this, however:

"this is parsed as a Map<Object, List<Map>> while I had expected it to be parsed as a Map<Object, LinkedHashMap>"

I am not quite sure how the structure would change here. Jackson defaults to using LinkedHashMap in general for all Object types, but in this case is the structure different too (Map with List values vs. Map with Map values)? That would seem odd, given that SnakeYAML should produce the proper structure no matter what.

cowtowncoder, Sep 20 '22 23:09

A single-property object was not the best example. The parser seems to treat !!omap values as a list of single-entry maps: {object=[{property1=value1}, {property2=value2}]}

So rather than parsing the !!omap as a single LinkedHashMap containing all the properties in order, it gives you an ArrayList containing a separate "singleton" LinkedHashMap for each property.
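As a possible workaround on the caller's side, the list of singleton maps can be merged back into one ordered map after parsing. The sketch below uses only the JDK; `OmapFlattener` and `flatten` are hypothetical names, not Jackson or SnakeYAML API, and the input is built by hand to match the shape shown above rather than being read from YAML.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Jackson or SnakeYAML.
final class OmapFlattener {

    private OmapFlattener() {}

    // Merges a list of single-entry maps into one LinkedHashMap,
    // keeping the entries in list order.
    static Map<Object, Object> flatten(List<? extends Map<?, ?>> entries) {
        Map<Object, Object> result = new LinkedHashMap<>();
        for (Map<?, ?> single : entries) {
            if (single.size() != 1) {
                throw new IllegalArgumentException(
                        "expected exactly one entry per element, got " + single.size());
            }
            Map.Entry<?, ?> entry = single.entrySet().iterator().next();
            if (result.containsKey(entry.getKey())) {
                // An !!omap must not contain duplicate keys.
                throw new IllegalArgumentException("duplicate key: " + entry.getKey());
            }
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the parsed value of "object".
        List<Map<String, String>> parsed = new ArrayList<>();
        parsed.add(Collections.singletonMap("property1", "value1"));
        parsed.add(Collections.singletonMap("property2", "value2"));

        System.out.println(flatten(parsed)); // {property1=value1, property2=value2}
    }
}
```

This only addresses the reading side; it does nothing to restore the !!omap tag on output.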

dpetroff, Sep 21 '22 13:09

Hmmmh. It is odd, then, that SnakeYAML doesn't seem to do the translation to such a structure at token level. At least it sounds like it won't.

cowtowncoder, Sep 22 '22 00:09