json-kotlin-schema-codegen
json-kotlin-schema-codegen copied to clipboard
Support for const (and/or single-value enums) in polymorphism
Thanks for sharing this project!
Would it be possible to support const
or single-value enum
as just pre-set value without constructor parameter?
Usecase: We're thinking about using this in our project, but we need to figure out how to support polymorphism in at least semi-intuitive way. I'm thinking the support of const
would make the first step.
E.g., this jsonschema:
title: Something Changed Event
allOf:
- $ref: "Event.yaml"
- $ref: "SomethingFields.yaml"
properties:
type:
const: "SomethingChangedEvent"
default: "SomethingChangedEvent"
Along with this referenced Event.yaml
parent:
title: Event
type: object
required:
- type
properties:
type:
type: string
Currently generate this child:
class SomethingChangedEvent(
type: String = "SomethingChangedEvent",
val other: Other
) : Event(type)
But this would be more error-proof if const
would result in this child being generated:
class SomethingChangedEvent(
val other: Other
) : Event(type = "SomethingChangedEvent")
Regarding the usecase, one possible next step for us would be to use Jackson2 mixins to also (manually) add polymorphic deserialization support based on this.
I'm not sure how this should work without polymorphism involved, though - perhaps type
in this case could be just val
field that is not present in constructor parameters? Is this reasonable?
After playing around with it a bit more, I found out that if there's no parent or if the intended parent is not the first in "allOf" list, then const is respected by generating the following code in data class
:
init {
require(type == cg_str0) { "type not constant value $cg_str0 - $type" }
}
If I understand it correctly, it could be useful for allowing deserialization framework to instantiate this object by supplying the value as the constructor argument. This would fail if there was no such constructor argument, even though there is only one possible valid value. If that's the reasoning, I understand the limitation, although I don't understand why this limitation is not applied in case of polymorphism too.
Still, perhaps it would be possible to have an extra constructor without this field? "For humans"? Or at least reorder fields based on whether they have defaults, so that the fields with defaults are at the end of the constructor arguments?
Hi - thanks for these suggestions. On the first point, using the const
value as a parameter to the base constructor from the derived class - I like the idea, but I took the approach of creating a field (and a constructor parameter) for each property. Your suggestion would mean that the constructor for the derived class would not have a parameter for the constant value.
As you pointed out in your later comment, when you use const
in a simple case with no inheritance it adds an init
check on the value. That is what I had planned - the property is present, but the const
forces a check which means that the constructor will fail if the wrong value is used. The fact that it doesn't output the check when the class is a derived class is a bug, and I am looking into it now.
I don't like your idea of re-ordering the fields - I think most people would expect constructor parameters to be in the same order as in the schema. Using Kotlin named parameters it's easy to include just the parameters you need.
On your second point, support for polymorphism - this is tricky because it depends on the serialization / deserialization library being used. I made an early design decision that the generated code should not contain any dependencies on other libraries, and I like to think that the code can be used by any of the JSON libraries.
Here I must declare a bias - I have my own serialization / deserialization library named kjson, and while I know that other libraries have a strong presence (you mentioned Jackson in one of your comments), I think that mine has real advantages. It is native Kotlin, rather than a Kotlin facade on Java, it supports Kotlin classes like Pair
and Triple
, and it includes non-blocking output, for use in a coroutine environment.
It includes polymorphic deserialization - see the section on fromJSONPolymorphic. In my day job, I use this functionality to deserialize CloudEvents, using the type
field in the CloudEvents envelope to determine which target class to use.
Of course, none of this is automated. I think it would be difficult to generate classes that could be deserialized automatically in a polymorphic manner, and any inclusion of library-dependent code would violate the principle I mentioned earlier of not adding dependencies on other libraries. But I'm interested to know how others are tackling the problem, and if a more automated means of deserializing members of a class hierarchy can be found, that could be a valuable addition to the library.
Thanks for a very informative reply!
Regarding kjson library, I think we could take a look into using it, but ultimately one of the reasons we're using Jackson is that it's "tried and true", so we have a lot of experience with using it, how to handle which problems, how it integrates with Spring Framework, what performance we can expect from it, etc. So switching is currently somewhat unlikely.
How would you recommend defining jsonschema for support of polymorphism? The way I was trying to do it with "type" property and "const" values in children?
Regarding changing order of properties. I found a somewhat interesting workaround. If I only mark "type" property as required in parent, but don't actually define it, then I can define it as the last property in each child and it works surprisingly well. Perhaps we could be satisfied with this approach for one more reason: if the parent doesn't have any properties at all, then children are generated as native data class
, which is arguably even better. I just don't know if this will be supported by Jackson or any other deserialization framework.
Regarding polymorphism using const
. I also tried another approach, which could theoretically provide enough information for a deserialization framework to know which types are possible to deserialize to. Perhaps there would be enough information to, e.g., override some templates to add Jackson's annotations. Here it is:
$schema: https://json-schema.org/draft-07/schema
type: object
required:
- type
properties:
type:
type: string
oneOf:
- $ref: "OtherFields.yaml"
properties:
type:
const: SomethingCreatedEvent
default: SomethingCreatedEvent
- $ref: "OtherFields.yaml"
properties:
type:
const: SomethingUpdatedEvent
default: SomethingUpdatedEvent
Unfortunately it generates nested classes Event.A
, Event.B
, Event.C
, etc. and I cannot find any way of at least changing the name of those classes, so it's a no-go.
In regard to the question of the recommended way of defining classes that are to be deserialized polymorphically - yes, I support the idea of defining a a "type" field in the base schema and redefining the same field as const
with a specific value for each sub-schema. How you make use of that in your deserialization library will be specific that that library, and it's difficult to see how it could be automated. I am considering adding a configuration option to allow annotations to be added to classes or fields, but that is probably some way in the future.
Generating code for a oneOf
presents some difficult challenges. As you have found, if there are no shared fields, the generator can create classes using the name from the $ref
, but when the class needs to combine properties from outside the oneOf
with those inside, it has to generate a new name ("A", "B" etc.). I am considering adding to the configuration option which currently allows you to choose an output class name (see classNames
), extending it to allow control over the choice of nested class names as well. That would require some manual configuration on your part, but it would be much simpler to implement than any automated mechanism for deriving the names.
I am considering adding a configuration option to allow annotations to be added to classes or fields, but that is probably some way in the future.
Sounds good! Specifically for Jackson's subtype annotations we would probably be OK even with just some empty templates that we could then override ourselves. So that, provided the context contains all the info, we would just customize these ourselves.
when the class needs to combine properties from outside the
oneOf
with those inside
I couldn't get it to not generate a name. I tried:
- referencing a different file, but then that file is also generated (without parent), so it's understandable that you can't have two types with the same name - one with parent and one without
- putting the contents of the other file directly in
oneOf
without using$ref
, but then it seems to have no info on how to name it, since there's no file (and it doesn't take title into account)
Logically, I can't think of a clean way of doing this with oneOf
. If you have parent in one "parent" schema file that contains oneOf
that references other "child" schema files, then the child files are generated without parent, since they don't mention the parent - they are just embeddables basically. In fact, they could be mentioned from multiple such parents! So in order to actually reuse (reference) those child schemas, the parent would actually need to be generated with code that contains some kind of nested children that are just wrappers for the actual children, then each wrapper would contain the actual embeddable child. That doesn't seem very clean, since it introduces an extra structure level not present in the schema itself. But at least those child wrappers can be called the same as the embeddable they contain, since child wrappers are nested classes in the parent. And serialization frameworks would know how to (de)serialize each of them (use type if wrapper, don't use type if embeddable).
Anyway, thanks for your time, I guess we'll try to use what is available for now. Should I close this issue or do you want to keep it for that missing const validation bug?
Keep it open - I have worked out a fix for that missing validation bug, and I should be able to get a new version uploaded to Maven Central within 24 hours. I'll notify you here.
Regarding oneOf
- if the parent schema contains no properties of its own, just the oneOf
with members consisting only of a single $ref
, it will output the parent as an interface
and use the name derived from the $ref
for the child classes. That can be useful, but it doesn't guarantee that a type discriminator property exists in the base, which is what I think you're trying to achieve.
I think we're both hitting against the mismatch between the JSON Schema specification and the requirements of polymorphism. In any case, this has been a useful discussion, and I hope you're able to make productive use of the library.
I couldn't make it generate interface
, but after your comment I tried a bit more thoroughly and it turns out it's not enough that there are no fields, I also need to remove type: object
for it to generate interface
. But although it generated interface
, it wasn't used in child classes at all, when those classes have allOf
in them. It works if I use just one $ref
in the child directly (and there's no parent class), but no longer works once I wrap it in allOf
(and a parent class is generated). Is this a bug? Also it's a bit strange that direct $ref
actually uses the fields from the $ref
, but doesn't use it as a parent. (Although this strange behavior is actually OK for us currently, I think.)
Parent:
$schema: https://json-schema.org/draft-07/schema
title: Event
description: Generic parent for all events
oneOf:
- $ref: SomethingChangedEvent.yaml
- $ref: SomethingUpdatedEvent.yaml
Child (has parent, doesn't implement interface):
$schema: https://json-schema.org/draft-07/schema
title: Something Changed Event
description: Event when Something changed
allOf:
- $ref: "Something.yaml"
properties:
type:
type: string
const: SomethingChangedEvent
default: SomethingChangedEvent
Child (no parent, implements interface):
$schema: https://json-schema.org/draft-07/schema
title: Something Changed Event
description: Event when Something changed
$ref: "Something.yaml"
properties:
type:
type: string
const: SomethingChangedEvent
default: SomethingChangedEvent
First of all, the new version of the project with the bug fix - version 0.78 - has been uploaded to Maven Central. It took longer than I expected because I fixed a number of similar issues related to validations not being applied to overriding property definitions at the same time.
I'm going to have to investigate further on the cases you have listed above - thanks for doing this investigative work. I'll let you know what I find.
You expressed an interest in the idea of adding annotations to the generated code – you may be interested to know that annotations are now available in version 0.82 of the code generator. The configuration is described in the Configuration Guide.
I hope this helps with your uses of the code generator.
Thanks for the note. I'll definitely try it out when I get back to the original initiative.
Btw, I did end up using your kjson lib already for a small bit of functionality due to JsonPointer support. Although with kotlinx.serialization being released, it's possible that us and others will slowly migrate to that one in the future.