Proposal: Deprecate the 'External serializers' feature for removal and replace it with a new 'Serializers DSL'
Why?
The external serializers feature is incomplete and has many bugs. The leading causes are fundamental technical limitations of the Kotlin compiler and of the feature's overall design, such as:
- The compiler does not store default values for properties anywhere in the metadata. This means it is impossible to generate code similar to if (prop != propDefaultValue) encoder.encodeElement(prop) in the external serializer (https://github.com/Kotlin/kotlinx.serialization/issues/2512)
- The compiler does not (always) store the fact that a property has a backing field. Since we aim to serialize properties with backing fields only and treat all others as transient, it can result in inconsistent serialization or descriptor information. (https://github.com/Kotlin/kotlinx.serialization/issues/2549)
- Generated serializers themselves are not customizable at all. You cannot specify SerialName for a property, nor specify a serializer for a specific property (in case a serializer for its type is also external, for example: https://github.com/Kotlin/kotlinx.serialization/issues/889)
- Java classes do not have a notion of a primary constructor. As a result, it is impossible to generate a serializer for a Java class at all, which limits the usability of this feature (https://github.com/Kotlin/kotlinx.serialization/issues/1223)
- Private or internal properties cannot be serialized correctly this way.
Replacement
Replacement solution requirements:
- The user should be able to specify the serial name, serializer, and default value for the property.
- The user should be able to specify which properties are serializable and which are not, as the compiler cannot deduce it for them.
- The user should be able to specify which constructor to call, as we cannot deduce it for them.
We can see that the ideal solution here is just to write a custom manual serializer. However, that requires significant expertise in kotlinx.serialization, and the serializers themselves are boilerplate-heavy, since the Encoders API is designed mainly to be called from code generated by the compiler plugin. Perhaps AI can generate something reasonable, but we shouldn't rely entirely on it.
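To make the boilerplate concrete, here is a minimal sketch of a hand-written serializer for a hypothetical third-party class Example with one defaulted property. The names Example and ExampleSerializer are assumptions for illustration. Note that the default value (0) has to be repeated by hand in two places, which is exactly the information the compiler does not expose to an external serializer.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.KSerializer
import kotlinx.serialization.SerializationException
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.descriptors.buildClassSerialDescriptor
import kotlinx.serialization.descriptors.element
import kotlinx.serialization.encoding.CompositeDecoder
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.encoding.decodeStructure
import kotlinx.serialization.encoding.encodeStructure
import kotlinx.serialization.json.Json

// Hypothetical third-party class that cannot be annotated with @Serializable.
class Example(val s: String, val cnt: Int = 0)

@OptIn(ExperimentalSerializationApi::class) // needed for shouldEncodeElementDefault
object ExampleSerializer : KSerializer<Example> {
    override val descriptor: SerialDescriptor =
        buildClassSerialDescriptor("Example") {
            element<String>("s")
            element<Int>("counter", isOptional = true)
        }

    override fun serialize(encoder: Encoder, value: Example) =
        encoder.encodeStructure(descriptor) {
            encodeStringElement(descriptor, 0, value.s)
            // The default (0) is duplicated here by hand.
            if (value.cnt != 0 || shouldEncodeElementDefault(descriptor, 1)) {
                encodeIntElement(descriptor, 1, value.cnt)
            }
        }

    override fun deserialize(decoder: Decoder): Example =
        decoder.decodeStructure(descriptor) {
            var s: String? = null
            var cnt = 0 // the default is duplicated here as well
            while (true) {
                when (val index = decodeElementIndex(descriptor)) {
                    0 -> s = decodeStringElement(descriptor, 0)
                    1 -> cnt = decodeIntElement(descriptor, 1)
                    CompositeDecoder.DECODE_DONE -> break
                    else -> error("Unexpected index: $index")
                }
            }
            Example(s ?: throw SerializationException("Field 's' is required"), cnt)
        }
}

fun main() {
    // Round-trip through JSON using the hand-written serializer.
    val json = Json.encodeToString(ExampleSerializer, Example("a", 3))
    println(json)
    println(Json.decodeFromString(ExampleSerializer, json).cnt)
}
```

Even for two properties this is around thirty lines of index bookkeeping, which is the pain point a DSL would aim to remove.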
A good example of avoiding writing a manual serializer is JsonTransformingSerializer, which is very popular and solves a lot of pain points. Providing a specialized solution for this particular problem also seems viable. Let’s name it Serializers DSL for now.
A draft could look like this:
class Example(val s: String, val cnt: Int = 0)
object Ex: AbstractSerializer<Example>() {
    override val serializableProperties = properties(
        Property("s", serializer = String.serializer()),
        Property("counter", serializer = Int.serializer(), optional = true, defaultValue = 0)
    )
    override fun deconstruct(obj: Example) {
        obj.s serializeAs properties["s"]
        obj.cnt serializeAs properties["counter"]
    }
    override fun construct(data: Properties): Example {
        return Example(data[properties["s"]], data[properties["counter"]])
    }
}
Or like this:
class Example(val s: String, val cnt: Int = 0)
object Ex: AbstractSerializer<Example>() {
    override val schema = properties(
        Property("s", serializer = String.serializer()) { it.s },
        Property("counter", serializer = Int.serializer(), optional = true, defaultValue = 0) { it.cnt }
    ).constructor { p1, p2 -> Example(p1, p2) }
}
It is also possible to employ a partial generation of code by the plugin:
class Example(val s: String, val cnt: Int = 0) // 3rd party module
abstract class AbstractSerializer<T>: KSerializer<T> {
    override val descriptor get() = schema.toDescriptor()
    abstract val schema: Properties
}
@ExternalSerializer
object Ex: AbstractSerializer<Example>() {
    override val schema = properties(
        Property("s", serializer = String.serializer(), Example::s),
        Property("counter", serializer = Int.serializer(), optional = true, defaultValue = 0, Example::cnt),
        Creator(::Example)
    )
    // generated by plugin:
    fun serialize(e: Example, enc: Encoder) {
        enc.encode(schema[0], e.s)
        enc.encode(schema[1], e.cnt)
    }
}
Concerns
- Serializers for non-standard classes (enum, sealed) should also be available to users. This means that we also have to design APIs for them (these are currently lacking, since such serializers are implementation details).
- Such a DSL with lambdas is inherently less performant than its manually written counterpart. Users may blame our design for this. The recommendation for performance problems will be “replace with a manually written serializer via the Encoders API”.
Curious: could the alternative to the Serializers DSL be to just use the intermediate model?
(I will use the example from the docs as a base.) Original:
// NOT @Serializable
class Project(val name: String, val language: String)
@OptIn(ExperimentalSerializationApi::class)
@Serializer(forClass = Project::class)
object ProjectSerializer
Without any additional APIs, it could be replaced with:
// NOT @Serializable
class Project(val name: String, val language: String)
@SerialName("example.exampleSerializer23.Project") // to match the original package
@Serializable
private class Project2(val name: String, val language: String)
object ProjectSerializer : KSerializer<Project> {
    private val delegate get() = Project2.serializer()
    override val descriptor: SerialDescriptor get() = delegate.descriptor
    override fun serialize(encoder: Encoder, value: Project) {
        delegate.serialize(
            encoder = encoder,
            value = Project2(
                name = value.name,
                language = value.language
            )
        )
    }
    override fun deserialize(decoder: Decoder): Project {
        return delegate.deserialize(decoder).let {
            Project(
                name = it.name,
                language = it.language
            )
        }
    }
}
But, by adding just one function, which allows us to map from one serializer to another, it could be simplified even more:
// new API
fun <I, T> KSerializer<I>.mapped(
    convertForEncoding: (T) -> I,
    convertForDecoding: (I) -> T
): KSerializer<T> = TODO()
// usage
@SerialName("example.exampleSerializer23.Project") // to match the original package
@Serializable
private class Project2(val name: String, val language: String)
object ProjectSerializer : KSerializer<Project> by Project2.serializer().mapped(
    convertForEncoding = { Project2(it.name, it.language) },
    convertForDecoding = { Project(it.name, it.language) }
)
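For reference, one possible body for the TODO() above is a thin wrapper that converts the value and then hands it to the original serializer. This is only a sketch of how such a function could work, not an existing library API. The Celsius class and celsiusSerializer are hypothetical and use the built-in Int serializer so that the example does not depend on the compiler plugin.

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.builtins.serializer
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.json.Json

// Sketch of the proposed API: wraps a serializer for I into a serializer for T.
fun <I, T> KSerializer<I>.mapped(
    convertForEncoding: (T) -> I,
    convertForDecoding: (I) -> T
): KSerializer<T> {
    val delegate = this
    return object : KSerializer<T> {
        // Assumption: the mapped serializer reuses the delegate's descriptor as-is.
        override val descriptor: SerialDescriptor get() = delegate.descriptor

        override fun serialize(encoder: Encoder, value: T) =
            encoder.encodeSerializableValue(delegate, convertForEncoding(value))

        override fun deserialize(decoder: Decoder): T =
            convertForDecoding(decoder.decodeSerializableValue(delegate))
    }
}

// Hypothetical third-party class, serialized as a plain Int.
class Celsius(val degrees: Int)

val celsiusSerializer: KSerializer<Celsius> = Int.serializer().mapped(
    convertForEncoding = { c: Celsius -> c.degrees },
    convertForDecoding = { Celsius(it) }
)

fun main() {
    val text = Json.encodeToString(celsiusSerializer, Celsius(21))
    println(text)
    println(Json.decodeFromString(celsiusSerializer, text).degrees)
}
```

One design consequence of this sketch: because the descriptor is shared, the mapped serializer reports the serial name of the intermediate type, which is why the usage above pins it with @SerialName on Project2.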
Also, this new mapped API (name TBD) might be useful in other cases? WDYT?
I don't know if it will be able to be a replacement for all cases, but it feels like a rather small and powerful addition :)
@whyoleg What you're describing here is a surrogate serializer, which is also a viable solution for this problem. I agree that a mapping serializer/deserializer (#2795) may streamline the process here.
Overall I think this is a good idea. I'm not sure that I like the DSL syntax (it feels rather clunky) and perhaps plugin support for surrogates/intermediates would be better (and more optimized). For DSLs it may be worth considering the option (for some platforms) to generate code at runtime and then load it, but such optimization could/would be internal to the library.
Another +1 for using streamlined delegation for this. It looks like Java is at least thinking about using the surrogate/delegate serializer approach (ref) for marshaling/unmarshaling. Their SchemaRecord seems like it could be fairly easily adapted to this library. For example:
@Serializable
data class ColorDelegate(val rgb: Int) : KSerializerDelegate<Color> {
    override fun toDelegator(): Color = ...
}
// the compiler plugin enforces Color extends DelegatesKSerializer<T> where T matches the annotation,
// and that T extends KSerializerDelegate<Self>
@Serializable(delegate = ColorDelegate::class)
data class Color(val r: Int, val g: Int, val b: Int) : DelegatesKSerializer<ColorDelegate>() {
    override fun toDelegate(): ColorDelegate = ...
}
The compiler plugin could even try to inline the conversion+serialization to avoid actually needing to allocate the delegate object, although hopefully value classes will solve that.
Indeed, the Java approach with SchemaRecords fits the delegate approach. And while it is easy to write by hand, it is better if done automatically.