Compilation fails with MethodTooLargeException on large sealed class
Describe the bug
When attempting to compile a @Serializable sealed class with ~2,000 subclasses, compilation fails with a MethodTooLargeException. This appears to be related to the way the default SealedClassSerializer is generated for the base class.
exception: caused by: org.jetbrains.org.objectweb.asm.MethodTooLargeException: Method too large: org/example/LargeSealedClassExample._init_$_anonymous_ ()Lkotlinx/serialization/KSerializer;
exception: at org.jetbrains.org.objectweb.asm.MethodWriter.computeMethodInfoSize(MethodWriter.java:2088)
exception: at org.jetbrains.org.objectweb.asm.ClassWriter.toByteArray(ClassWriter.java:512)
exception: at org.jetbrains.kotlin.codegen.ClassBuilderFactories$1.asBytes(ClassBuilderFactories.java:83)
exception: at org.jetbrains.kotlin.codegen.DelegatingClassBuilderFactory.asBytes(DelegatingClassBuilderFactory.kt:27)
exception: at org.jetbrains.kotlin.codegen.DelegatingClassBuilderFactory.asBytes(DelegatingClassBuilderFactory.kt:27)
exception: at org.jetbrains.kotlin.codegen.ClassFileFactory$ClassBuilderAndSourceFileList.asBytes(ClassFileFactory.java:317)
exception: at org.jetbrains.kotlin.codegen.ClassFileFactory$OutputClassFile.asByteArray(ClassFileFactory.java:281)
exception: ... 29 more
To Reproduce
I've created a minimal sample project that reproduces this: https://github.com/vrendina/large-sealed-class-serialization
Expected behavior
If the @Serializable annotation is removed from the base class, there is no issue compiling with 10,000+ subclasses, so the problem does seem to be related to how the methods are constructed for the serializer.
Environment
- Kotlin version: [e.g. 2.1.20]
- Library version: [e.g. 1.8.1]
- Kotlin platforms: JVM
- Gradle version: [8.10]
@vrendina Serialization must initialize the actual list of serializers. This happens by calling the constructor with the children as parameter. Each child is initialised (and added to the list) in initializer code. The code that generates this assumes that actual method length limits (in the class file format) are not hit, but it doesn't surprise me that this isn't checked.
@pdvrieze yeah I can see that in the arrays that are passed to the SealedClassSerializer. The compiler doesn't really like having a large array passed into the constructor. For the purposes of experimentation I created a local snapshot build with an open SealedClassSerializer and a non-internal constructor for the AbstractPolymorphicSerializer.
If you try to construct a SealedClassSerializer manually as shown below, compilation fails. I'm assuming this is close to what the compiler plugin generates:
object LargeSealedClassSerializer: SealedClassSerializer<LargeSealedClassExample>(
serialName = LargeSealedClassExample::class.qualifiedName!!,
baseClass = LargeSealedClassExample::class,
subclasses = subclasses,
subclassSerializers = subclassSerializers
)
val subclasses: Array<KClass<out LargeSealedClassExample>> = arrayOf(
LargeSealedClassExample.LargeSealedClassExample1::class,
...
LargeSealedClassExample.LargeSealedClassExample2000::class,
)
val subclassSerializers: Array<KSerializer<out LargeSealedClassExample>> = arrayOf(
LargeSealedClassExample.LargeSealedClassExample1.serializer(),
...
LargeSealedClassExample.LargeSealedClassExample2000.serializer(),
)
I then tried creating an AbstractPolymorphicSerializer that uses maps to look up the serializers, and this was successful.
object LargeSealedClassSerializer: AbstractPolymorphicSerializer<LargeSealedClassExample>() {
override val baseClass: KClass<LargeSealedClassExample> = LargeSealedClassExample::class
@OptIn(InternalSerializationApi::class, ExperimentalSerializationApi::class)
override val descriptor: SerialDescriptor
get() = buildSerialDescriptor(
serialName = LargeSealedClassExample::class.qualifiedName!!,
kind = PolymorphicKind.SEALED
) {
element("type", String.serializer().descriptor)
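// Register the nested descriptor as the "value" element so its result is not discarded.
element("value", buildSerialDescriptor("kotlinx.serialization.Sealed<${baseClass.simpleName}>", SerialKind.CONTEXTUAL))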
}
@InternalSerializationApi
override fun findPolymorphicSerializerOrNull(
decoder: CompositeDecoder,
klassName: String?
): DeserializationStrategy<LargeSealedClassExample>? {
return klassName?.let {
serialName2Serializer[it] as? KSerializer<LargeSealedClassExample>
}
}
@InternalSerializationApi
override fun findPolymorphicSerializerOrNull(
encoder: Encoder,
value: LargeSealedClassExample
): SerializationStrategy<LargeSealedClassExample>? {
return class2Serializer[value::class] as? KSerializer<LargeSealedClassExample>
}
}
private val class2Serializer = HashMap<KClass<out LargeSealedClassExample>, KSerializer<out LargeSealedClassExample>>().apply {
put(LargeSealedClassExample.LargeSealedClassExample1::class, LargeSealedClassExample.LargeSealedClassExample1.serializer())
...
put(LargeSealedClassExample.LargeSealedClassExample2000::class, LargeSealedClassExample.LargeSealedClassExample2000.serializer())
}
private val serialName2Serializer = HashMap<String, KSerializer<out LargeSealedClassExample>>().apply {
put("LargeSealedClassExample1", LargeSealedClassExample.LargeSealedClassExample1.serializer())
...
put("LargeSealedClassExample2000", LargeSealedClassExample.LargeSealedClassExample2000.serializer())
}
Creating those arrays doesn't look like code, but it is actually code in the constructor which has a code size limit. I suspect that your map code is somewhat shorter and can handle a bigger amount.
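To make the size argument concrete, here is a rough back-of-the-envelope sketch. The only hard fact in it is the JVM rule that a single method's bytecode must be shorter than 65536 bytes. The per-entry byte counts are assumed placeholders, not measurements of what the compiler plugin emits; `javap -c` on the real class file would give actual numbers. The function name `maxEntriesPerMethod` is hypothetical.

```kotlin
// Hard limit: a single JVM method's bytecode must be shorter than 65536 bytes.
const val MAX_METHOD_BYTES = 65_535

// Rough upper bound on how many entries fit into one initializer method.
// bytesPerEntry is an ASSUMED average cost per array element or put(...) call.
fun maxEntriesPerMethod(bytesPerEntry: Int, fixedOverhead: Int = 0): Int {
    require(bytesPerEntry > 0) { "bytesPerEntry must be positive" }
    return (MAX_METHOD_BYTES - fixedOverhead) / bytesPerEntry
}

fun main() {
    // Placeholder per-entry costs, chosen only to show how the bound scales.
    for (bytes in listOf(8, 16, 32)) {
        println("$bytes bytes/entry -> at most ${maxEntriesPerMethod(bytes)} entries")
    }
}
```

Whatever the real per-entry cost is, the bound is linear in it, so a more compact initializer only raises the ceiling and does not remove it.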
Yeah, you are right: if I push the map code up to 5,000 entries, it fails as well. The only other thing I can think of is to fall back to a reflection-based serializer, which is unfortunate because it is about 50x slower than the serializer generated by the compiler plugin.
object LargeSealedClassSerializer: AbstractPolymorphicSerializer<LargeSealedClassExample>() {
val class2Serializer = HashMap<KClass<out LargeSealedClassExample>, KSerializer<LargeSealedClassExample>>()
val serialName2Serializer = HashMap<String, KSerializer<LargeSealedClassExample>>()
override val baseClass: KClass<LargeSealedClassExample> = LargeSealedClassExample::class
@OptIn(InternalSerializationApi::class, ExperimentalSerializationApi::class)
override val descriptor: SerialDescriptor
get() = buildSerialDescriptor(
serialName = LargeSealedClassExample::class.qualifiedName!!,
kind = PolymorphicKind.SEALED
) {
element("type", String.serializer().descriptor)
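// Register the nested descriptor as the "value" element so its result is not discarded.
element("value", buildSerialDescriptor("kotlinx.serialization.Sealed<${baseClass.simpleName}>", SerialKind.CONTEXTUAL))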
}
@InternalSerializationApi
override fun findPolymorphicSerializerOrNull(
decoder: CompositeDecoder,
klassName: String?
): DeserializationStrategy<LargeSealedClassExample>? {
return serialName2Serializer.getOrPut(klassName!!) {
@Suppress("UNCHECKED_CAST")
kotlinx.serialization.serializer(
Class.forName("${baseClass.qualifiedName}$$klassName").kotlin, emptyList(), false
) as KSerializer<LargeSealedClassExample>
}
}
@InternalSerializationApi
override fun findPolymorphicSerializerOrNull(
encoder: Encoder,
value: LargeSealedClassExample
): SerializationStrategy<LargeSealedClassExample>? {
val klass = value::class
return class2Serializer.getOrPut(klass) {
@Suppress("UNCHECKED_CAST")
kotlinx.serialization.serializer(klass, emptyList(), false) as KSerializer<LargeSealedClassExample>
}
}
}
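One caveat with the reflection fallback above: a plain HashMap with getOrPut is not safe if the serializer is used from several threads. A minimal sketch of the same caching idea on top of ConcurrentHashMap follows. `FakeSerializer` and `SerializerCache` are hypothetical stand-ins so the snippet runs without kotlinx.serialization; in the real serializer the loader would be the reflective `serializer(...)` call.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Stand-in for a KSerializer obtained through a slow reflective lookup.
class FakeSerializer(val className: String)

class SerializerCache(private val load: (String) -> FakeSerializer) {
    private val cache = ConcurrentHashMap<String, FakeSerializer>()

    // computeIfAbsent invokes the loader at most once per key, even under contention.
    fun get(className: String): FakeSerializer =
        cache.computeIfAbsent(className) { load(it) }
}

fun main() {
    val loads = AtomicInteger()
    val cache = SerializerCache { name -> loads.incrementAndGet(); FakeSerializer(name) }
    repeat(3) { cache.get("org.example.Sub1") }
    cache.get("org.example.Sub2")
    println("lookups=4 loads=${loads.get()}") // prints "lookups=4 loads=2"
}
```

The slow path is then paid once per subclass rather than once per call.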
@vrendina I've had a look at the JVM class file format. Its maximum method size is 2^32; if that is not sufficient, you have other problems. But unfortunately it seems that ASM limits the code size to 2^16 bytes (64 KB). A workaround would be to have multiple delegate methods that each initialise some of the children.
Whatever way you solve it, it might be easiest to use a custom serializer instead of the generated one.
I think 'method too large' error comes from the lambda in the lazy {} argument which caches serializers arrays in generated code. Perhaps we can turn caching off in this case, if it would help. cc @shanshin
@sandwwraith If not in lazy it needs to come from somewhere, the only "solution" would be to generate the array content dynamically (probably involving some sort of reflection - at least for JVM versions without sealed types)
You can shard the content into additional functions after so many elements. That way only these extreme cases pay for it.
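A minimal sketch of what such sharding could look like for an array, as opposed to one large arrayOf(...) expression. All names here (`FakeSerializer`, `fillSlice0`, `buildSerializers`) are hypothetical, and this is not what the compiler plugin generates today; it only illustrates the shape of the technique with a stand-in type.

```kotlin
// Stand-in for a KSerializer so the sketch runs on the standard library alone.
class FakeSerializer(val name: String)

const val TOTAL = 5

// Each helper fills one contiguous slice; generated code would put up to N entries in each.
fun fillSlice0(out: Array<FakeSerializer?>) {
    out[0] = FakeSerializer("Sub1")
    out[1] = FakeSerializer("Sub2")
    out[2] = FakeSerializer("Sub3")
}

fun fillSlice1(out: Array<FakeSerializer?>) {
    out[3] = FakeSerializer("Sub4")
    out[4] = FakeSerializer("Sub5")
}

fun buildSerializers(): Array<FakeSerializer> {
    val out = arrayOfNulls<FakeSerializer>(TOTAL)
    fillSlice0(out)
    fillSlice1(out)
    // requireNoNulls fails fast if a slice was skipped.
    return out.requireNoNulls()
}

fun main() {
    println(buildSerializers().joinToString { it.name }) // prints "Sub1, Sub2, Sub3, Sub4, Sub5"
}
```

Only the coordinating function grows with the number of slices, and it costs one call per slice instead of one block of instructions per element.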
I was just going to reference this wire PR https://github.com/square/wire/pull/3214 -- seems like a similar case. Thanks @JakeWharton
As a quick PoC of @JakeWharton's suggestion, I tried creating a serializer for a 5,000-item sealed class by chunking the creation of the serializer lookup maps into 1,000-item functions, and that appears to be working:
object LargeSealedClassSerializer : AbstractPolymorphicSerializer<LargeSealedClassExample>() {
val class2Serializer by lazy(mode = LazyThreadSafetyMode.PUBLICATION) {
buildClass2Serializer()
}
val serialName2Serializer by lazy(mode = LazyThreadSafetyMode.PUBLICATION) {
buildSerialName2Serializer()
}
...
@InternalSerializationApi
override fun findPolymorphicSerializerOrNull(
decoder: CompositeDecoder,
klassName: String?
): DeserializationStrategy<LargeSealedClassExample>? {
return serialName2Serializer[klassName!!] as? KSerializer<LargeSealedClassExample>
}
@InternalSerializationApi
override fun findPolymorphicSerializerOrNull(
encoder: Encoder,
value: LargeSealedClassExample
): SerializationStrategy<LargeSealedClassExample>? {
return class2Serializer[value::class] as? KSerializer<LargeSealedClassExample>
}
private fun buildClass2Serializer(): HashMap<KClass<out LargeSealedClassExample>, KSerializer<out LargeSealedClassExample>> {
val map = HashMap<KClass<out LargeSealedClassExample>, KSerializer<out LargeSealedClassExample>>()
buildClass2Serializer0(map)
buildClass2Serializer1(map)
buildClass2Serializer2(map)
buildClass2Serializer3(map)
buildClass2Serializer4(map)
return map
}
private fun buildSerialName2Serializer(): HashMap<String, KSerializer<out LargeSealedClassExample>> {
val map = HashMap<String, KSerializer<out LargeSealedClassExample>>()
buildSerialName2Serializer0(map)
buildSerialName2Serializer1(map)
buildSerialName2Serializer2(map)
buildSerialName2Serializer3(map)
buildSerialName2Serializer4(map)
return map
}
}
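The numbered helpers (buildClass2Serializer0 and so on) are not shown in the PoC above. With thousands of subclasses they would presumably be generated rather than written by hand. Below is a hedged sketch of a small generator that emits such helpers as source text. `generateChunkFunctions` and the emitted function bodies are assumptions that mirror the PoC's naming, not code from the sample project.

```kotlin
// Emits Kotlin source for chunked helper functions, one function per chunk of subclasses.
// The emitted signature and body are ASSUMED to match the PoC's helpers.
fun generateChunkFunctions(
    subclassNames: List<String>,
    chunkSize: Int,
    base: String = "LargeSealedClassExample",
): String = buildString {
    subclassNames.chunked(chunkSize).forEachIndexed { index, chunk ->
        appendLine("private fun buildClass2Serializer$index(map: MutableMap<KClass<out $base>, KSerializer<out $base>>) {")
        for (name in chunk) {
            appendLine("    map[$base.$name::class] = $base.$name.serializer()")
        }
        appendLine("}")
    }
}

fun main() {
    val names = (1..5).map { "LargeSealedClassExample$it" }
    // With chunkSize = 2 this emits three functions holding 2, 2 and 1 entries.
    print(generateChunkFunctions(names, chunkSize = 2))
}
```

In practice the chunk size would be picked well below whatever count first triggers MethodTooLargeException, such as the 1,000 used in the PoC.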