apicurio-registry icon indicating copy to clipboard operation
apicurio-registry copied to clipboard

Error Timeout trying to lock table with apicurio-registry-mem and Confluent serializer

Open loicmathieu opened this issue 3 years ago • 10 comments
trafficstars

Hi, I'm using the docker container quay.io/apicurio/apicurio-registry-mem:2.2.5.Final in my integration tests with Confluent serializers. It sometimes, but not always, fails with a Timeout trying to lock table exception.

Complete stacktrace :

 org.apache.kafka.common.errors.SerializationException: Error registering Avro schema{"type":"record","name":"Trace","namespace":"com.decathlon.vct.vcstream.avro.trace","doc":"All terminal components (sender, listener, anomaly) will send this trace object to the tracing system with the exchange information","fields":[{"name":"component","type":[{"type":"string","avro.java.string":"String"}],"doc":"Component that initiated the Trace","default":"missing-component"},{"name":"type","type":{"type":"string","avro.java.string":"String"},"doc":"Data type of the message","default":"missing-trace-type"},{"name":"direction","type":{"type":"enum","name":"Direction","symbols":["INBOUND","OUTBOUND"]},"doc":"Inbound or Outbound"},{"name":"startTime","type":["null","long"],"default":null},{"name":"uuid","type":{"type":"string","avro.java.string":"String"},"doc":"UUID of the message","default":"missing-trace-uuid"},{"name":"dispatchCriteria","type":["null",{"type":"string","avro.java.string":"String"}],"doc":"Dispatch criteria of the message","default":null},{"name":"reason","type":["null",{"type":"string","avro.java.string":"String"}],"doc":"Deprecated - Remove it break FORWARD Compatibility","default":null},{"name":"duration","type":["null","long"],"default":null},{"name":"endTime","type":["null","long"],"default":null},{"name":"batchId","type":["null",{"type":"string","avro.java.string":"String"}],"doc":"Batch id of the message","default":null},{"name":"status","type":{"type":"enum","name":"TraceStatus","symbols":["OK","ERROR"]},"default":"ERROR"}],"version":5}
        at io.confluent.kafka.serializers.AbstractKafkaSchemaSerDe.toKafkaException(AbstractKafkaSchemaSerDe.java:259)
        at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:168)
        at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:61)
        at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
        at io.smallrye.reactive.messaging.kafka.fault.SerializerWrapper.lambda$serialize$1(SerializerWrapper.java:56)
        at io.smallrye.reactive.messaging.kafka.fault.SerializerWrapper.wrapSerialize(SerializerWrapper.java:81)
        at io.smallrye.reactive.messaging.kafka.fault.SerializerWrapper.serialize(SerializerWrapper.java:56)
        at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:945)
        at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:905)
        at io.smallrye.reactive.messaging.kafka.impl.ReactiveKafkaProducer.lambda$send$5(ReactiveKafkaProducer.java:127)
        at io.smallrye.context.impl.wrappers.SlowContextualConsumer.accept(SlowContextualConsumer.java:21)
        at io.smallrye.mutiny.operators.uni.builders.UniCreateWithEmitter.subscribe(UniCreateWithEmitter.java:22)
        at io.smallrye.mutiny.operators.AbstractUni.subscribe(AbstractUni.java:36)
        at io.smallrye.mutiny.operators.uni.UniRunSubscribeOn.lambda$subscribe$0(UniRunSubscribeOn.java:27)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: io.apicurio.registry.storage.impl.sql.jdb.RuntimeSqlException: org.h2.jdbc.JdbcSQLException: Timeout trying to lock table ; SQL statement:
INSERT INTO sequences (tenantId, name, value) VALUES (?, ?, ?) [50200-197]; error code: 0
        at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:301)
        at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:371)
        at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:548)
        at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:536)
        at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:494)
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:274)
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:381)
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:354)
        at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:125)
        ... 15 more

loicmathieu avatar Aug 17 '22 12:08 loicmathieu

To be able to answer I would like to know more information, specifically about the lock mode you're using, the lock timeout etc, especially if you changed them to non-default values.

Thanks!

carlesarnal avatar Aug 25 '22 09:08 carlesarnal

@carlesarnal I just use the schema registry docker container with only confluent serializer/deserializer and the error pops up from time to time inside my test cases (which use testcontainers to start the container). I didn't specify anything about any lock.

loicmathieu avatar Aug 25 '22 15:08 loicmathieu

As you probably know, the default lock time is relatively low (1s). Do you mind trying to increase it and see what happens?

Thanks!

carlesarnal avatar Sep 07 '22 11:09 carlesarnal

@carlesarnal I have no idea there was such settings and also have no idea how to increase it via the docker image.

loicmathieu avatar Sep 07 '22 12:09 loicmathieu

By default Apicurio Registry sets the jdbc url using the env var REGISTRY_DATASOURCE_URL. The default value when the in-memory version is used is jdbc:h2:mem:registry_db, you can try setting that value to jdbc:h2:mem:registry_db;DEFAULT_LOCK_TIMEOUT=5000.

carlesarnal avatar Sep 08 '22 07:09 carlesarnal

Thanks @carlesarnal I'll try it tomorrow and let you know

loicmathieu avatar Sep 08 '22 08:09 loicmathieu

Hi @loicmathieu , do you mind sharing how to reproduce the issue on our side? Meanwhile, I find this comment possibly relevant: https://stackoverflow.com/a/4162685 We are using H2 version 1.4.199 so you can try to append also ;MVCC=true to the DATASOURCE_URL as @carlesarnal mentioned.

andreaTP avatar Sep 08 '22 10:09 andreaTP

@carlesarnal @andreaTP with both setting I was able to reproduce the issue. I'll try to provide a reproducer but as it's via a complex Quarkus test using automatic container (devservices) this may not be easy to reproduce.

loicmathieu avatar Sep 08 '22 14:09 loicmathieu

I have the same issue as you, @loicmathieu. I am using Quarkus with Confluent Avro serializers and the apicurio dev-services container. I only had this problem when using Quarkus' built-in Kafka integration that relies on Smallrye Reactive Messaging. When directly using a KafkaProducer which uses the Apicurio container to write to Kafka, I don't have any issues.

Are you using Smallrye to write to Kafka?

donare avatar Sep 12 '22 21:09 donare

@donare yes, the issue arise when using Quarkus Smallrye Reactive Messaging Kafka extension.

loicmathieu avatar Sep 13 '22 07:09 loicmathieu

One bit of news that is possibly unrelated. We have improved the in-memory version of registry that is used as part of dev mode. We are now creating a native version. This should improve startup time drastically, but I'm not sure if it will have any impact on this issue.

EricWittmann avatar Nov 11 '22 18:11 EricWittmann

Agreed, we're not only creating a native version now, but also the h2 version has been significantly updated (from 1.4.x to 2.x) so worth trying. @loicmathieu do you mind trying on your side?

Thanks!

carlesarnal avatar Nov 14 '22 15:11 carlesarnal