
More configurable TOML serialization

Open yawkat opened this issue 3 years ago • 11 comments

Right now, the toml serializer generates only top level properties, and inline tables where necessary (i.e. where arrays are used):

abc.foo = 1
abc.bar = 2
abc.xyz = [{foo = 1, bar = 2}]

There are special cases where we could generate normal toml tables. The example above can be expressed as:

[abc]
foo = 1
bar = 2

[[abc.xyz]]
foo = 1
bar = 2

This format will in many cases be more readable. However, there are problems with non-inline tables that prevent us from emitting them in a streaming generator:

  • Once a table is started, it is impossible to go back to the parent or root table. If an object has scalar properties that we only see after we started a subtable, we have a problem.
  • Array tables only work with arrays of objects. If an array is heterogeneous, but we've already started emitting it as an array table, we have a problem.

These problems can only be solved with knowledge of the tree being serialized, and perhaps even by influencing property order so that scalars always come first. However, because I expect serialization to toml to be fairly niche – it is a configuration format meant to be written by humans, after all – it is not worth adding machinery to databind to implement this.
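To make the "knowledge of the tree" point concrete, here is a minimal sketch that works on an already-built tree (plain nested `Map`/`List` values) instead of a token stream. `TomlTreeSketch` and its methods are hypothetical names for illustration, not anything in jackson-dataformats-text. It assumes bare keys, no null values, and only simple string escaping. Scalars are written first, sub-tables afterwards, and a list becomes an array table only when every element is an object; anything else stays inline.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of jackson-dataformats-text. */
final class TomlTreeSketch {

    /** Renders a nested map as TOML, writing scalars before any table header. */
    static String render(Map<String, Object> root) {
        StringBuilder out = new StringBuilder();
        writeTable(root, "", out);
        return out.toString();
    }

    private static void writeTable(Map<String, Object> table, String path, StringBuilder out) {
        Map<String, Object> deferred = new LinkedHashMap<>();
        // Pass 1: everything that must appear before a nested header.
        for (Map.Entry<String, Object> e : table.entrySet()) {
            Object v = e.getValue();
            if (v instanceof Map || isArrayOfTables(v)) {
                deferred.put(e.getKey(), v);
            } else {
                out.append(e.getKey()).append(" = ").append(inline(v)).append('\n');
            }
        }
        // Pass 2: nested tables and array tables, in their original order.
        for (Map.Entry<String, Object> e : deferred.entrySet()) {
            String child = path.isEmpty() ? e.getKey() : path + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                header("[" + child + "]", out);
                writeTable(asMap(e.getValue()), child, out);
            } else {
                for (Object element : (List<?>) e.getValue()) {
                    header("[[" + child + "]]", out);
                    writeTable(asMap(element), child, out);
                }
            }
        }
    }

    /** Writes a header line, separated from earlier output by a blank line. */
    private static void header(String text, StringBuilder out) {
        if (out.length() > 0) out.append('\n');
        out.append(text).append('\n');
    }

    /** True only for non-empty lists whose elements are all objects. */
    private static boolean isArrayOfTables(Object v) {
        if (!(v instanceof List) || ((List<?>) v).isEmpty()) return false;
        for (Object element : (List<?>) v) {
            if (!(element instanceof Map)) return false;
        }
        return true;
    }

    /** Inline form: used for scalars, heterogeneous arrays and the objects inside them. */
    private static String inline(Object v) {
        if (v instanceof String) {
            return "\"" + ((String) v).replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        List<String> parts = new ArrayList<>();
        if (v instanceof Map) {
            for (Map.Entry<String, Object> e : asMap(v).entrySet()) {
                parts.add(e.getKey() + " = " + inline(e.getValue()));
            }
            return "{" + String.join(", ", parts) + "}";
        }
        if (v instanceof List) {
            for (Object element : (List<?>) v) parts.add(inline(element));
            return "[" + String.join(", ", parts) + "]";
        }
        return String.valueOf(v); // assumed to be a number or boolean
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object v) {
        return (Map<String, Object>) v;
    }

    public static void main(String[] args) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("foo", 1);
        item.put("bar", 2);
        Map<String, Object> abc = new LinkedHashMap<>();
        abc.put("xyz", Collections.singletonList(item)); // seen before the scalars
        abc.put("foo", 1);
        abc.put("bar", 2);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("abc", abc);
        System.out.print(render(root));
    }
}
```

The two passes are what a purely streaming generator cannot do: the decision for `xyz` is postponed until the rest of `abc` has been seen.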

Another question to consider: What even is the best representation of an object? A short inline object may be more readable than starting a new table that includes the entire path to that object.

A possible solution to this problem would be to add an API to the TomlGenerator to specifically start a table. One approach would be to create overloads for writeStartObject and writeStartArray that allow forcing generation as a table. If data is then passed that cannot be represented as a table (the bullet points above), we would error. While we're at it, we could also allow emitting comments with additional methods.
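Roughly what I have in mind, as a standalone sketch and not a patch against TomlGenerator. Every name here (`ForcedTableWriterSketch`, `writeStartTable`, `writeEndTable`, `writeComment`) is made up for illustration, and it only handles plain tables and integer values, not array tables. The point is the error path: once a sub-table header has been written, a scalar for the parent is rejected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; these method names are not TomlGenerator API. */
final class ForcedTableWriterSketch {
    private final StringBuilder out = new StringBuilder();
    private final List<String> path = new ArrayList<>();
    // One flag per open scope (root at the bottom): was a child table header written?
    private final Deque<Boolean> childStarted = new ArrayDeque<>();

    ForcedTableWriterSketch() {
        childStarted.push(false); // root scope
    }

    void writeComment(String text) {
        out.append("# ").append(text).append('\n');
    }

    void writeNumber(String key, long value) {
        if (childStarted.peek()) {
            // The header of a sub-table is already in the output, so this
            // key would silently end up in the wrong table.
            throw new IllegalStateException(
                    "Cannot write '" + key + "': a sub-table was already started in this scope");
        }
        out.append(key).append(" = ").append(value).append('\n');
    }

    void writeStartTable(String name) {
        // Mark the enclosing scope as closed for scalars, then open the new one.
        childStarted.pop();
        childStarted.push(true);
        childStarted.push(false);
        path.add(name);
        if (out.length() > 0) out.append('\n');
        out.append('[').append(String.join(".", path)).append("]\n");
    }

    void writeEndTable() {
        if (path.isEmpty()) throw new IllegalStateException("No table is open");
        childStarted.pop();
        path.remove(path.size() - 1);
    }

    String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        ForcedTableWriterSketch w = new ForcedTableWriterSketch();
        w.writeComment("generated");
        w.writeStartTable("abc");
        w.writeNumber("foo", 1);
        w.writeStartTable("xyz");
        w.writeNumber("bar", 2);
        w.writeEndTable();
        w.writeEndTable();
        System.out.print(w.result());
    }
}
```

In the real generator the same check would sit behind the forced-table overloads, with the default inline behaviour left untouched.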

We could also inspect annotations on the forValue passed to writeStartObject, though this seems like misuse of that parameter, so not a good idea.

yawkat avatar Apr 09 '21 09:04 yawkat

Good questions. A streaming writer can still buffer all of the content and only emit things at the end (the Properties backend does this), but whether that makes sense is an open question.

I suspect that users will be asking for this in one particular case, for what that is worth: when modifying an existing TOML document.

I agree that while annotations might be a nice way, they cannot be accessed at the streaming backend and would need to be somehow passed by databind. I have thought about this a bit wrt YAML output: there are various styles for textual content (no less than... five variations); no good ideas yet on how those should be passed. In case of YAML, a custom String serializer could be defined; but then the issue becomes that of annotation handling. The only module that really supports custom annotations quite extensively is the XML module, and it is a bit problematic.

One possibility, I think, would be an option that would force use of Tables but also assume strict ordering -- such that if a closed table is "re-written", it'd throw an exception. This could work in a read-modify-write cycle where ordering is preserved (f.ex via JsonNode), or at least statically forced (POJOs). That might not be a bad option, I think, with appropriate warnings on the feature used to enable it?
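A rough sketch of that strict-ordering idea, outside of any real generator. `StrictOrderTablesSketch` is a hypothetical name, the input is assumed to be an ordered map of dotted keys to numbers or booleans, and keys are assumed to be bare. A header is written whenever the table part of the key changes, and a table that was already left cannot be entered again.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a "force tables, assume strict ordering" mode. */
final class StrictOrderTablesSketch {

    /** Emits tables in encounter order; fails if a closed table is written to again. */
    static String render(Map<String, Object> orderedDottedKeys) {
        StringBuilder out = new StringBuilder();
        Set<String> closed = new HashSet<>();
        String current = ""; // the root table is open at the start
        for (Map.Entry<String, Object> e : orderedDottedKeys.entrySet()) {
            String dotted = e.getKey();
            int dot = dotted.lastIndexOf('.');
            String table = dot < 0 ? "" : dotted.substring(0, dot);
            String key = dotted.substring(dot + 1);
            if (!table.equals(current)) {
                closed.add(current); // leaving a table closes it for good
                if (closed.contains(table)) {
                    String name = table.isEmpty() ? "the root table" : "[" + table + "]";
                    throw new IllegalStateException(
                            "Cannot write '" + dotted + "': " + name + " was already closed");
                }
                if (out.length() > 0) out.append('\n');
                out.append('[').append(table).append("]\n");
                current = table;
            }
            // Values are assumed to be numbers or booleans, so toString() is valid TOML.
            out.append(key).append(" = ").append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> ordered = new LinkedHashMap<>();
        ordered.put("abc.foo", 1);
        ordered.put("abc.bar", 2);
        ordered.put("def.x", 3);
        System.out.print(render(ordered));
    }
}
```

With input whose order is preserved (or statically fixed) this never throws; with arbitrary ordering it fails fast instead of producing a document that parses differently from what was intended.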

cowtowncoder avatar Apr 10 '21 00:04 cowtowncoder

hope to support table

qiyuey avatar Aug 14 '22 10:08 qiyuey

PRs welcome!

cowtowncoder avatar Aug 15 '22 00:08 cowtowncoder

tables and nested tables is one of the basics

sysmat avatar Nov 22 '22 13:11 sysmat

@sysmat Yes, as I said, PRs welcome. Arguing about usefulness of something does little to implement said feature.

cowtowncoder avatar Nov 22 '22 17:11 cowtowncoder

What is the status of this?

I started trying to use Jackson's TOML support, and when I write at present it seems to output things in a format similar to the one mentioned above, like

abc.foo = 1
abc.bar = 2
abc.xyz = [{foo = 1, bar = 2}]

But I was expecting the latter

[abc]
foo = 1
bar = 2

[[abc.xyz]]
foo = 1
bar = 2

So is the ability to add a "table" (or whatever the toml nomenclature is - still new to it) not available in Jackson and is it related to this?

ebresie avatar Feb 26 '23 15:02 ebresie

Would adding some sort of annotation be a way forward? Say a @Table(header="abc") which could be applied to a given Java class?

Although I suppose if multiple "Table" were added, that might have to be applied to each attribute to allow given object to group related items into the same or different sections.

ebresie avatar Feb 26 '23 15:02 ebresie

The challenge with format-specific annotations is that they cannot be supported by jackson-databind, which guides mapping from properties (Java object) to format events. So typically annotations need to have more general applicability. There are some exceptions -- the XML module has a few that operate at a low enough level to change token streams -- and so annotation support for format-specific things needs to work at a level beyond databinding.

But it is also possible that no annotations are needed and it is just a question of making use of existing naming conventions and reconstructing the output. This is what the "Properties" backend does.
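To illustrate the naming-convention idea only (this is not how the Properties backend is implemented, and `DottedKeyRegroupSketch` is a hypothetical name): buffer the flat `a.b.c = value` lines the generator produces today, group them by everything before the last dot, and emit each group under a table header. It assumes bare keys and that the first " = " in a line separates key from value; arrays of objects are left inline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: regroup buffered dotted-key lines into TOML sections. */
final class DottedKeyRegroupSketch {

    static String regroup(List<String> flatLines) {
        // Section name -> lines, in first-seen order; root keys always come first.
        Map<String, List<String>> sections = new LinkedHashMap<>();
        sections.put("", new ArrayList<>());
        for (String line : flatLines) {
            int eq = line.indexOf(" = ");
            String dotted = line.substring(0, eq);
            int dot = dotted.lastIndexOf('.');
            String table = dot < 0 ? "" : dotted.substring(0, dot);
            String rest = dotted.substring(dot + 1) + line.substring(eq);
            sections.computeIfAbsent(table, k -> new ArrayList<>()).add(rest);
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> section : sections.entrySet()) {
            if (section.getValue().isEmpty()) continue; // e.g. no root keys at all
            if (!section.getKey().isEmpty()) {
                if (out.length() > 0) out.append('\n');
                out.append('[').append(section.getKey()).append("]\n");
            }
            for (String line : section.getValue()) out.append(line).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(regroup(Arrays.asList(
                "abc.foo = 1",
                "top = true",
                "abc.bar = 2",
                "abc.xyz = [{foo = 1, bar = 2}]")));
    }
}
```

Because everything is buffered first, property order stops mattering, at the cost of holding the whole document in memory until the end.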

So I think it may be just that output side was left at minimum support level, not due to specific limitations.

cowtowncoder avatar Feb 26 '23 18:02 cowtowncoder

a good first step would be to support it in the generator, if anyone is interested in writing a pr. it's not on my roadmap atm.

yawkat avatar Feb 26 '23 18:02 yawkat

Challenge with format-specific annotations is that they cannot be supported by jackson-databind, which guides mapping from properties (Java object) to format events.

So typically annotations need to have more general applicability. There are some exceptions -- XML module has a few that operate at low enough level to change token streams -- and so annotation support for format-specific things need to work at level beyond databinding.

When working with JPA there are annotations to identify tables, columns, ids, etc.

With xml annotations in JAXB there are annotations like xml root, xml elements, and xml attributes.

Would either of these be examples to build off of for possible annotation development here?

ebresie avatar Feb 27 '23 20:02 ebresie

What I am trying to say, as is @yawkat, is that the output side of TOML needs work even before considering the need for new annotations.

As to JAXB, it is XML-specific so not really (although Jackson has some compatibility support); JPA is DB-specific so I don't think so.

But the original description of the issue is relevant: first things first, output formatting basically does not exist wrt sections. It would be possible to add that with default logic, and if necessary, then consider other annotations.

cowtowncoder avatar Feb 27 '23 21:02 cowtowncoder