kotlinx.serialization icon indicating copy to clipboard operation
kotlinx.serialization copied to clipboard

Describing recursive structures with `buildClassSerialDescriptor`

Open sugarmanz opened this issue 3 years ago • 5 comments

What is your use-case and why do you need this feature?

Describing a recursive structure as a SerialDescriptor is a pain. The docs say to use the buildClassSerialDescriptor for describing complex structures and for the most part, that is a pretty nice way to do it. Except, once you need to define a recursive element, you can't do it easily because it requires access to the descriptor that's currently being defined.

data class Nested(
    val flag: Boolean,
    val nested: Nested? = null,
)

object NestedSerializer : KSerializer<Nested> {
    override val descriptor = buildClassSerialDescriptor {
        element("flag", Boolean.serializer().descriptor)
        element("nested", Nested.serializer().descriptor, isOptional = true)
        // or
        element<Boolean>("flag")
        element<Nested?>("nested", isOptional = true)
    }
} 

This code is problematic due to the cyclical reference to the NestedSerializer and its descriptor. I was pretty confused that this seemingly wasn't already supported, but it lead me down the rabbit hole and I eventually found the private defer helper for wrapping a descriptor in a lazy delegate. And then the PR comment regarding this approach from nearly two years ago.

Although, I am finding it a little strange that I couldn't find any other tickets with this issue since that comment in nearly two years, so maybe I'm completely missing something.

Describe the solution you'd like

At the very least, I don't want to have to copy private helper code from the library I'm using in order to do something somewhat straightforward. I'm not sure why it's not part of the public API to begin with, but making it public or even InternalSerializationApi so that I can assume the risk instead of maintaining duplicate helper code would be an okay short term solution. If it's not public because it's not a good solution, then I think it warrants a deeper look anyways rather than consumers copying private helpers.

For a more involved approach, I'm not too keen on the details at the moment, but ideally, the code shouldn't look any different than defining any other element, as demonstrated in the code block above. Of the two approaches, the first enables the writer to explicitly pass the desired descriptor, or, maybe even the ClassSerialDescriptorBuilder itself.

buildClassSerialDescriptor {
    element("flag", Boolean.serializer().descriptor)
    element("nested", this, isOptional = true)
}

Though I do know that is potentially problematic because the builder doesn't contain all the information required of the descriptor, namely the typeParameters. But the builder could be changed to represent a more holistic representation of the descriptor and maybe a memoized fun build(): SerialDescriptor such that if you have a descriptor builder, calling build on it would produce the same instance of the descriptor as long as its properties remained unchanged.

Given that most of the examples use a form of implicit serializer lookup, the following would probably be desirable as well:

buildClassSerialDescriptor {
    element<Boolean>("flag")
    element<Nested?>("nested", isOptional = true)
}

In this case, the element extension function would need to short circuit if the type parameter was the same type as the descriptor is for, otherwise it'd lead to the same cyclical issue. Maybe the builder itself would need to store the KClass of the structure it is describing, such that it could be checked against in that function?

buildClassSerialDescriptor<Nested> {
    element<Boolean>("flag")
    element<Nested?>("nested", isOptional = true)
}

Anyways, let me know if I can help out in any way! More than happy to take a stab at this if deemed appropriate.

sugarmanz avatar Jan 04 '22 18:01 sugarmanz

I had a look at the implementation. There is a cludge that you could use which is you own implementation of SerialDescriptor that just delegates to the serializer. As a more sustainable solution, it should be possible to use this in the builder where needed. Btw. defer is only defined for the json format, not for the library overall. The solution would look like:

class IndirectSerialDescriptor(private val serializer: KSerializer<*>): SerialDescriptor by serializer.descriptor

pdvrieze avatar Jan 06 '22 10:01 pdvrieze

Hey @pdvrieze, thanks for taking a look.

The IndirectSerialDescriptor approach seems to work at first glance, but it actually fails because the descriptor has not yet been instantiated. It's interesting that the compiler allows this code, but if you try to print the descriptor, you'll get an NPE when it tries to read that element. You can verify this even quicker with:

class IndirectSerialDescriptor(private val serializer: KSerializer<*>): SerialDescriptor by serializer.descriptor!!

The IDE tells you the !! is redundant, but it'll end up throwing the NPE if used in a recursive fashion:

object NestedSerializer : KSerializer<Nested> {
    override val descriptor = buildClassSerialDescriptor {
        element("flag", Boolean.serializer().descriptor)
        element("nested", IndirectSerialDescriptor(Nested.serializer()), isOptional = true)
    }
} 

Even if it did work, I wouldn't really accept this as a scalable solution. Yes, it's really only one line of code for consumers to maintain, but there is still the cognitive burden for consumers to realize that they can't use the recommended approach for registering recursive structures. My first approach was to use:

element<Nested?>("nested", isOptional = true)

But the error isn't actually useful either:

Caused by: java.lang.NullPointerException
	at kotlinx.serialization.builtins.BuiltinSerializersKt.getNullable(BuiltinSerializers.kt:20)
	at kotlinx.serialization.SerializersKt__SerializersKt.nullable$SerializersKt__SerializersKt(Serializers.kt:181)
	at kotlinx.serialization.SerializersKt__SerializersKt.serializerByKTypeImpl$SerializersKt__SerializersKt(Serializers.kt:86)
	at kotlinx.serialization.SerializersKt__SerializersKt.serializer(Serializers.kt:59)
	at kotlinx.serialization.SerializersKt.serializer(Unknown Source)
	at kotlinx.serialization.SerializersKt__SerializersKt.serializer(Serializers.kt:41)
	at kotlinx.serialization.SerializersKt.serializer(Unknown Source)
	at Nested$Serializer$descriptor$1.invoke

My point is, defining recursive descriptors should be supported as a first class feature of the ClassSerialDescriptorBuilder.

defer is only defined for the json format, not for the library overall

The defer approach does work because it only evaluates the provider after the descriptor is able to be instantiated. But the fact that it's private and only for a specific format really only enhances my point. The GH comment that I posted above highlights that this was going to be an issue for consumers too. And even if it was exposed, I still would think that defer falls into the same argument above with the IndirectSerialDescriptor.

sugarmanz avatar Jan 06 '22 18:01 sugarmanz

Ah, that is because delegation has issues. You will have to delegate manually (store the serializer, not the descriptor). I forgot that the shorthand delegation uses a hidden property, not the actual code written after by

pdvrieze avatar Jan 06 '22 18:01 pdvrieze

Right, but this ends up being the same code as defer then:

private class IndirectSerialDescriptor(private val serializer: KSerializer<*>): SerialDescriptor {

    private val original: SerialDescriptor by lazy {
        serializer.descriptor
    }

    override val serialName: String
        get() = original.serialName
    override val kind: SerialKind
        get() = original.kind
    override val elementsCount: Int
        get() = original.elementsCount

    override fun getElementName(index: Int): String = original.getElementName(index)
    override fun getElementIndex(name: String): Int = original.getElementIndex(name)
    override fun getElementAnnotations(index: Int): List<Annotation> = original.getElementAnnotations(index)
    override fun getElementDescriptor(index: Int): SerialDescriptor = original.getElementDescriptor(index)
    override fun isElementOptional(index: Int): Boolean = original.isElementOptional(index)
}

Which leads me back to one of my original points:

At the very least, I shouldn't have to copy private helper code from the library I'm using in order to do something somewhat straightforward.

sugarmanz avatar Jan 06 '22 18:01 sugarmanz

I also recently faced this issue. Is there any update on this? Can I help somehow?

lukellmann avatar Apr 19 '23 12:04 lukellmann