json-kotlin-schema-codegen icon indicating copy to clipboard operation
json-kotlin-schema-codegen copied to clipboard

Support for const (and/or single-value enums) in polymorphism

Open koscejev opened this issue 2 years ago • 12 comments

Thanks for sharing this project!

Would it be possible to support const or single-value enum as just pre-set value without constructor parameter?

Usecase: We're thinking about using this in our project, but we need to figure out how to support polymorphism in at least semi-intuitive way. I'm thinking the support of const would make the first step.

E.g., this jsonschema:

title: Something Changed Event
allOf:
  - $ref: "Event.yaml"
  - $ref: "SomethingFields.yaml"
properties:
  type:
    const: "SomethingChangedEvent"
    default: "SomethingChangedEvent"

Along with this referenced Event.yaml parent:

title: Event
type: object
required:
  - type
properties:
  type:
    type: string

Currently generate this child:

class SomethingChangedEvent(
    type: String = "SomethingChangedEvent",
    val other: Other
) : Event(type)

But this would be more error-proof if const would result in this child being generated:

class SomethingChangedEvent(
    val other: Other
) : Event(type = "SomethingChangedEvent")

Regarding the usecase, one possible next step for us would be to use Jackson2 mixins to also (manually) add polymorphic deserialization support based on this.

I'm not sure how this should work without polymorphism involved, though - perhaps type in this case could be just val field that is not present in constructor parameters? Is this reasonable?

koscejev avatar Jun 07 '22 15:06 koscejev

After playing around with it a bit more, I found out that if there's no parent or if the intended parent is not the first in "allOf" list, then const is respected by generating the following code in data class:

init {
    require(type == cg_str0) { "type not constant value $cg_str0 - $type" }
}

If I understand it correctly, it could be useful for allowing deserialization framework to instantiate this object by supplying the value as the constructor argument. This would fail if there was no such constructor argument, even though there is only one possible valid value. If that's the reasoning, I understand the limitation, although I don't understand why this limitation is not applied in case of polymorphism too.

Still, perhaps it would be possible to have an extra constructor without this field? "For humans"? Or at least reorder fields based on whether they have defaults, so that the fields with defaults are at the end of the constructor arguments?

koscejev avatar Jun 08 '22 12:06 koscejev

Hi - thanks for these suggestions. On the first point, using the const value as a parameter to the base constructor from the derived class - I like the idea, but I took the approach of creating a field (and a constructor parameter) for each property. Your suggestion would mean that the constructor for the derived class would not have a parameter for the constant value.

As you pointed out in your later comment, when you use const in a simple case with no inheritance it adds an init check on the value. That is what I had planned - the property is present, but the const forces a check which means that the constructor will fail if the wrong value is used. The fact that it doesn't output the check when the class is a derived class is a bug, and I am looking into it now.

I don't like your idea of re-ordering the fields - I think most people would expect constructor parameters to be in the same order as in the schema. Using Kotlin named parameters it's easy to include just the parameters you need.

On your second point, support for polymorphism - this is tricky because it depends on the serialization / deserialization library being used. I made an early design decision that the generated code should not contain any dependencies on other libraries, and I like to think that the code can be used by any of the JSON libraries.

Here I must declare a bias - I have my own serialization / deserialization library named kjson, and while I know that other libraries have a strong presence (you mentioned Jackson in one of your comments), I think that mine has real advantages. It is native Kotlin, rather than a Kotlin facade on Java, it supports Kotlin classes like Pair and Triple, and it includes non-blocking output, for use in a coroutine environment.

It includes polymorphic deserialization - see the section on fromJSONPolymorphic. In my day job, I use this functionality to deserialize CloudEvents, using the type field in the CloudEvents envelope to determine which target class to use.

Of course, none of this is automated. I think it would be difficult to generate classes that could be deserialized automatically in a polymorphic manner, and any inclusion of library-dependent code would violate the principle I mentioned earlier of not adding dependencies on other libraries. But I'm interested to know how others are tackling the problem, and if a more automated means of deserializing members of a class hierarchy can be found, that could be a valuable addition to the library.

pwall567 avatar Jun 08 '22 13:06 pwall567

Thanks for a very informative reply!

Regarding kjson library, I think we could take a look into using it, but ultimately one of the reasons we're using Jackson is that it's "tried and true", so we have a lot of experience with using it, how to handle which problems, how it integrates with Spring Framework, what performance we can expect from it, etc. So switching is currently somewhat unlikely.

How would you recommend defining jsonschema for support of polymorphism? The way I was trying to do it with "type" property and "const" values in children?

koscejev avatar Jun 08 '22 14:06 koscejev

Regarding changing order of properties. I found a somewhat interesting workaround. If I only mark "type" property as required in parent, but don't actually define it, then I can define it as the last property in each child and it works surprisingly well. Perhaps we could be satisfied with this approach for one more reason: if the parent doesn't have any properties at all, then children are generated as native data class, which is arguably even better. I just don't know if this will be supported by Jackson or any other deserialization framework.

koscejev avatar Jun 08 '22 14:06 koscejev

Regarding polymorphism using const. I also tried another approach, which could theoretically provide enough information for a deserialization framework to know which types are possible to deserialize to. Perhaps there would be enough information to, e.g., override some templates to add Jackson's annotations. Here it is:

$schema: https://json-schema.org/draft-07/schema
type: object
required:
  - type
properties:
  type:
    type: string
oneOf:
  - $ref: "OtherFields.yaml"
    properties:
      type:
        const: SomethingCreatedEvent
        default: SomethingCreatedEvent
  - $ref: "OtherFields.yaml"
    properties:
      type:
        const: SomethingUpdatedEvent
        default: SomethingUpdatedEvent

Unfortunately it generates nested classes Event.A, Event.B, Event.C, etc. and I cannot find any way of at least changing the name of those classes, so it's a no-go.

koscejev avatar Jun 08 '22 14:06 koscejev

In regard to the question of the recommended way of defining classes that are to be deserialized polymorphically - yes, I support the idea of defining a a "type" field in the base schema and redefining the same field as const with a specific value for each sub-schema. How you make use of that in your deserialization library will be specific that that library, and it's difficult to see how it could be automated. I am considering adding a configuration option to allow annotations to be added to classes or fields, but that is probably some way in the future.

Generating code for a oneOf presents some difficult challenges. As you have found, if there are no shared fields, the generator can create classes using the name from the $ref, but when the class needs to combine properties from outside the oneOf with those inside, it has to generate a new name ("A", "B" etc.). I am considering adding to the configuration option which currently allows you to choose an output class name (see classNames), extending it to allow control over the choice of nested class names as well. That would require some manual configuration on your part, but it would be much simpler to implement than any automated mechanism for deriving the names.

pwall567 avatar Jun 09 '22 11:06 pwall567

I am considering adding a configuration option to allow annotations to be added to classes or fields, but that is probably some way in the future.

Sounds good! Specifically for Jackson's subtype annotations we would probably be OK even with just some empty templates that we could then override ourselves. So that, provided the context contains all the info, we would just customize these ourselves.

when the class needs to combine properties from outside the oneOf with those inside

I couldn't get it to not generate a name. I tried:

  • referencing a different file, but then that file is also generated (without parent), so it's understandable that you can't have two types with the same name - one with parent and one without
  • putting the contents of the other file directly in oneOf without using $ref, but then it seems to have no info on how to name it, since there's no file (and it doesn't take title into account)

Logically, I can't think of a clean way of doing this with oneOf. If you have parent in one "parent" schema file that contains oneOf that references other "child" schema files, then the child files are generated without parent, since they don't mention the parent - they are just embeddables basically. In fact, they could be mentioned from multiple such parents! So in order to actually reuse (reference) those child schemas, the parent would actually need to be generated with code that contains some kind of nested children that are just wrappers for the actual children, then each wrapper would contain the actual embeddable child. That doesn't seem very clean, since it introduces an extra structure level not present in the schema itself. But at least those child wrappers can be called the same as the embeddable they contain, since child wrappers are nested classes in the parent. And serialization frameworks would know how to (de)serialize each of them (use type if wrapper, don't use type if embeddable).

Anyway, thanks for your time, I guess we'll try to use what is available for now. Should I close this issue or do you want to keep it for that missing const validation bug?

koscejev avatar Jun 10 '22 07:06 koscejev

Keep it open - I have worked out a fix for that missing validation bug, and I should be able to get a new version uploaded to Maven Central within 24 hours. I'll notify you here.

Regarding oneOf - if the parent schema contains no properties of its own, just the oneOf with members consisting only of a single $ref, it will output the parent as an interface and use the name derived from the $ref for the child classes. That can be useful, but it doesn't guarantee that a type discriminator property exists in the base, which is what I think you're trying to achieve.

I think we're both hitting against the mismatch between the JSON Schema specification and the requirements of polymorphism. In any case, this has been a useful discussion, and I hope you're able to make productive use of the library.

pwall567 avatar Jun 10 '22 08:06 pwall567

I couldn't make it generate interface, but after your comment I tried a bit more thoroughly and it turns out it's not enough that there are no fields, I also need to remove type: object for it to generate interface. But although it generated interface, it wasn't used in child classes at all, when those classes have allOf in them. It works if I use just one $ref in the child directly (and there's no parent class), but no longer works once I wrap it in allOf (and a parent class is generated). Is this a bug? Also it's a bit strange that direct $ref actually uses the fields from the $ref, but doesn't use it as a parent. (Although this strange behavior is actually OK for us currently, I think.)

Parent:

$schema: https://json-schema.org/draft-07/schema
title: Event
description: Generic parent for all events
oneOf:
  - $ref: SomethingChangedEvent.yaml
  - $ref: SomethingUpdatedEvent.yaml

Child (has parent, doesn't implement interface):

$schema: https://json-schema.org/draft-07/schema
title: Something Changed Event
description: Event when Something changed
allOf:
  - $ref: "Something.yaml"
properties:
  type:
    type: string
    const: SomethingChangedEvent
    default: SomethingChangedEvent

Child (no parent, implements interface):

$schema: https://json-schema.org/draft-07/schema
title: Something Changed Event
description: Event when Something changed
$ref: "Something.yaml"
properties:
  type:
    type: string
    const: SomethingChangedEvent
    default: SomethingChangedEvent

koscejev avatar Jun 13 '22 07:06 koscejev

First of all, the new version of the project with the bug fix - version 0.78 - has been uploaded to Maven Central. It took longer than I expected because I fixed a number of similar issues related to validations not being applied to overriding property definitions at the same time.

I'm going to have to investigate further on the cases you have listed above - thanks for doing this investigative work. I'll let you know what I find.

pwall567 avatar Jun 13 '22 14:06 pwall567

You expressed an interest in the idea of adding annotations to the generated code – you may be interested to know that annotations are now available in version 0.82 of the code generator. The configuration is described in the Configuration Guide.

I hope this helps with your uses of the code generator.

pwall567 avatar Jul 20 '22 14:07 pwall567

Thanks for the note. I'll definitely try it out when I get back to the original initiative.

Btw, I did end up using your kjson lib already for a small bit of functionality due to JsonPointer support. Although with kotlinx.serialization being released, it's possible that us and others will slowly migrate to that one in the future.

koscejev avatar Jul 21 '22 12:07 koscejev