kotlinx.serialization
kotlinx.serialization copied to clipboard
Support Proto oneOf with new annotations and polymophic serializer
I commented in the issue #67 but I think it is necessary to open a new issue for further discussion.
I am writing a code-gen plugin to build kotlin data class for our proto file. As for oneOf case, a sealed interface may be an ideal solution.
Let's say I have a proto message like
message Person {
string name = 1;
oneof phone {
string mobile = 2;
string home = 3;
}
}
My ideal data class will be like
data class Person(
val name: String,
val phone: IPhoneType,
)
sealed interface IPhoneType
data class MobilePhone(val value: String): IPhoneType
data class HomePhone(val value: String): IPhoneType
So I need to tell the ProtoBuf Decoder that if it comes with ProtoNum 2 or 3, deserialize it as IPhoneType and assign to the phone field.
A custom serializer for the whole Person class can work, but it would be nice to have some additional annotation supports. Like:
data class Person(
@ProtoNum(1) val name: String,
@ProtoOneOfFields(2, 3) val phone: IPhoneType,
)
sealed interface IPhoneType
@ProtoOneOfNum(2)
data class MobilePhone(val value: String): IPhoneType
@ProtoOneOfNum(3)
data class HomePhone(val value: String): IPhoneType
@ProtoOneOfFields tells that this field may be assined by the following ProtoNums, and the @ProtoOneOfNum on the concrete class tells which ProtoNum can be parsed to this type.
Generally, this case could be a flatten key for common serializer ( not ProtoBuf only ).
For example, encoding an instance of Person defined like
@Serializable
data class Person(
val name: String,
@Flatten val address: Address,
)
@Serializable
data class Address(val city: String, val street: String)
is especting content (show in json) like:
{
"name": "Jhon",
"city": "London",
"street": "1st Avenue"
}
And then, @ProtoOneOfFields can be a subtype of @Flatten, with more flexiable funcitons.
Yes, that's one of the ways to do it. Although I'd say that @Flatten is a different scenario.
@ProtoOneOfFields may be misleading as there is no way for the format to actually resolve the candidates outside the normal polymorphic enumeration system. As such this annotation could only serve to restrict the candidates (which may be a valid feature, but should be named such that that is clear).
Personally I would go with @ProtoOneOf on the field (or a configuration option to make that the default in case that all candidates have numbers). Then I'd make @ProtoNum apply to types (rather than just fields) - I don't think a differently named annotation is needed here.
As an implementation note, it is a bit "clunky" to extract all polymorphic descendants from a SerialModule, but it is possible.
It is true that oneOf cannot just work with current polymorphic system. This implementation is also to find a way to make them work together.
Just change a bit according to @pdvrieze suggested.
data class Person(
@ProtoNum(1) val name: String,
@ProtoOneOf(2, 3) val phone: IPhoneType,
)
sealed interface IPhoneType
@ProtoNum(2) data class MobilePhone(val value: String): IPhoneType
@ProtoNum(3) data class HomePhone(val value: String): IPhoneType
The first thing is, when the serializer goes to val phone: IPhoneType, it should not treat the property as a serailizable with calling beginStructure, but read/write the inner value property of the concreate type directly instead.
That is why I think a Flatten annotation is necessary here.
As for the polymorphic handling, the ProtoNum is a perfect discriminator to find serializer of sealed interface, without any extra content needed. But it seems that such extra content is required by AbstractPolymorphicSerializer, so the ProtoBuf format may provide another PolymorphicSerializer implementation.
And such new polymophic serializer is helpful if someone wants to hold proto enum data in kotlin sealed interface, rather than kotlin enum class.
For implementation the format can recognise the polymorphic serializer and do its own thing. No need for another serializer that tells it how to. Alternatively you can have the format synthesise the data expected by the polymorphic serializer. The xml format does this for example.
The way to implement it is as follows for the decoding
- The regular
CompositeDecoderdetermines which of its children is polymorphic (withOneOfhandling). - For each child the decoder determines all potential polymorphic children (caching makes sense here) either because the type is sealed or from the serializerModule. It
- When a child is read that is a "oneOf" child it uses the mapping from the previous step to then create "special" transparent
CompositeDecoder(ie.OneOfDecoder : CompositeDecoder) that takes the expected type/serializer as one of its constructor parameters. - The OneOfDecoder would synthesize decoding in two parts. (
beginStructure/endStructurewould not be reading any actual data) The first is a string of the serialName for the type. The second is the actual deserialization of the value (only this involves reading the protobuf content).
Encoding works in parallel, but the OneOfEncoder rather has a property it can ignore it entirely. Then when the value is written as second member it uses the the protonum annotation on the type (from the passed in serialDescriptor) for writing (rather than the default number used for polymorphism).
I have exactly the same use case. Coincidentally, I just started to take a look to see how hard it was to implement it after checking the protobuf schema generated for a polymorphic class. Good to know I'm not the only one.
For context, this is the schema generated for my classes:
syntax = "proto2";
// serial name 'cash.atto.commons.AttoTransaction'
message AttoTransaction {
required AttoBlock block = 1;
...
}
// serial name 'cash.atto.commons.AttoBlock'
message AttoBlock {
required string type = 1;
// decoded as message with one of these types:
// message CHANGE, serial name 'CHANGE'
// message OPEN, serial name 'OPEN'
// message RECEIVE, serial name 'RECEIVE'
// message SEND, serial name 'SEND'
required bytes value = 2;
}
message CHANGE { // I'm also serializing this class to json, this is why I'm using @SerialName
...
}
message OPEN {
...
}
...
And this is what I was expecting:
syntax = "proto2";
// serial name 'cash.atto.commons.AttoTransaction'
message AttoTransaction {
oneof block {
AttoOpen open = 1;
AttoSend send = 2;
...
}
...
}
// serial name 'cash.atto.commons.AttoOpen'
message AttoOpen {
...
}
// serial name 'cash.atto.commons.AttoSend'
message AttoSend {
...
}
...
Sounds interesting @xiaozhikang0916
🎉 great news