Vertex Thread blocked errors
I have a relatively small size Strimzi Kafka setup for my development testing and getting this error in Strimzi Kafka Bridge (v 0.20.1 ) every day.
2022-01-06 11:38:39 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 37142 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
at org.apache.kafka.common.config.ConfigResource.<init>(ConfigResource.java:68) ~[kafka-clients-2.7.0.jar:?]
at io.vertx.kafka.client.common.impl.Helper.to(Helper.java:164) ~[vertx-kafka-client-4.1.0.jar:4.1.0]
at io.vertx.kafka.client.common.impl.Helper$$Lambda$400/0x00000008403c6440.apply(Unknown Source) ~[?:?]
My pod has restarted 37 times in the last 36 days, and I have only used it for a max of 2 or 3 days.
How can I debug this further? Is it possible to run Strimzi Bridge in debug mode?
Is it possible to detect these exceptions and pod restart events in Strimzi Prometheus/Grafana and create an alert?
Thanks
@ppatierno @sknot-rh Iny idea? ^^^
Can you paste the entire stack trace, I would like to know from where it's coming, while you are using it or even in idle conditions. I have never seen this.
I'm not seeing the daily error on idle usage anymore, maybe because I reinstalled the cluster on 7th Jan. I got an error message just now on my first usage.
2022-01-07 12:56:07 INFO Metadata:279 - [Producer clientId=producer-30] Cluster ID: TCh6nJ0aQTCWqhar7prJjg
2022-01-07 12:57:07 INFO KafkaProducer:1205 - [Producer clientId=producer-30] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
2022-01-07 12:57:07 INFO Metrics:668 - Metrics scheduler closed
2022-01-07 12:57:07 INFO Metrics:672 - Closing reporter org.apache.kafka.common.metrics.JmxReporter
2022-01-07 12:57:07 INFO Metrics:678 - Metrics reporters closed
2022-01-07 12:57:07 INFO AppInfoParser:83 - App info kafka.producer for producer-30 unregistered
2022-01-07 12:57:07 INFO KafkaProducer:1205 - [Producer clientId=producer-29] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
2022-01-07 12:57:07 INFO Metrics:668 - Metrics scheduler closed
2022-01-07 12:57:07 INFO Metrics:672 - Closing reporter org.apache.kafka.common.metrics.JmxReporter
2022-01-07 12:57:07 INFO Metrics:678 - Metrics reporters closed
2022-01-07 12:57:07 INFO AppInfoParser:83 - App info kafka.producer for producer-29 unregistered
2022-01-14 14:28:13 INFO send:85 - [609416115] SEND Request: from xxxxxxxxx:1656, method = POST, path = /topics/outbound-email-topic
2022-01-14 14:28:14 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2579 ms, time limit is 2000 ms
2022-01-14 14:28:14 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 3872 ms, time limit is 2000 ms
2022-01-14 14:28:14 INFO ProducerConfig:361 - ProducerConfig values:
acks = 1
batch.size = 16384
bootstrap.servers = [my-cluster-kafka-bootstrap:9092]
buffer.memory = 33554432
client.dns.lookup = use_all_dns_ips
client.id = producer-31
compression.type = none
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = false
interceptor.classes = []
internal.auto.downgrade.txn.commit = false
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metadata.max.idle.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
socket.connection.setup.timeout.max.ms = 127000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.certificate.chain = null
ssl.keystore.key = null
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
2022-01-14 14:28:15 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 4852 ms, time limit is 2000 ms
2022-01-14 14:28:15 INFO AppInfoParser:119 - Kafka version: 2.7.0
2022-01-14 14:28:15 INFO AppInfoParser:120 - Kafka commitId: 448719dc99a19793
2022-01-14 14:28:15 INFO AppInfoParser:121 - Kafka startTimeMs: 1642170495945
2022-01-14 14:28:16 INFO ProducerConfig:361 - ProducerConfig values:
acks = 0
batch.size = 16384
bootstrap.servers = [my-cluster-kafka-bootstrap:9092]
buffer.memory = 33554432
client.dns.lookup = use_all_dns_ips
client.id = producer-32
compression.type = none
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = false
interceptor.classes = []
internal.auto.downgrade.txn.commit = false
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metadata.max.idle.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 2147483647
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
socket.connection.setup.timeout.max.ms = 127000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.certificate.chain = null
ssl.keystore.key = null
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
2022-01-14 14:28:16 INFO Metadata:279 - [Producer clientId=producer-31] Cluster ID: TCh6nJ0aQTCWqhar7prJjg
2022-01-14 14:28:16 INFO AppInfoParser:119 - Kafka version: 2.7.0
2022-01-14 14:28:16 INFO AppInfoParser:120 - Kafka commitId: 448719dc99a19793
2022-01-14 14:28:16 INFO AppInfoParser:121 - Kafka startTimeMs: 1642170496134
2022-01-14 14:28:16 INFO Metadata:279 - [Producer clientId=producer-32] Cluster ID: TCh6nJ0aQTCWqhar7prJjg
2022-01-14 14:28:16 INFO send:85 - [609416115] SEND Response: statusCode = 200, message = OK
2022-01-14 14:28:16 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 5852 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
at com.fasterxml.jackson.databind.SerializationFeature.enabledIn(SerializationFeature.java:472) ~[jackson-databind-2.11.3.jar:2.11.3]
at com.fasterxml.jackson.databind.SerializationConfig.initialize(SerializationConfig.java:735) ~[jackson-databind-2.11.3.jar:2.11.3]
at com.fasterxml.jackson.databind.ObjectMapper.createGenerator(ObjectMapper.java:1134) ~[jackson-databind-2.11.3.jar:2.11.3]
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsBytes(ObjectMapper.java:3687) ~[jackson-databind-2.11.3.jar:2.11.3]
at io.vertx.core.json.jackson.DatabindCodec.toBuffer(DatabindCodec.java:182) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.json.JsonObject.toBuffer(JsonObject.java:747) ~[vertx-core-4.1.0.jar:4.1.0]
at io.strimzi.kafka.bridge.http.converter.HttpJsonMessageConverter.toKafkaRecords(HttpJsonMessageConverter.java:78) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.converter.HttpJsonMessageConverter.toKafkaRecords(HttpJsonMessageConverter.java:23) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpSourceBridgeEndpoint.handle(HttpSourceBridgeEndpoint.java:121) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpBridge.processProducer(HttpBridge.java:456) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpBridge.send(HttpBridge.java:266) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpBridge.access$000(HttpBridge.java:63) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpBridge$1.process(HttpBridge.java:628) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpOpenApiOperation.handle(HttpOpenApiOperation.java:34) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.strimzi.kafka.bridge.http.HttpOpenApiOperation.handle(HttpOpenApiOperation.java:17) ~[kafka-bridge-0.20.1.jar:0.20.1]
at io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1127) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:114) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:133) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.api.validation.impl.BaseValidationHandler.handle(BaseValidationHandler.java:94) ~[vertx-web-api-contract-4.1.0.jar:4.1.0]
at io.vertx.ext.web.api.validation.impl.BaseValidationHandler.handle(BaseValidationHandler.java:20) ~[vertx-web-api-contract-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1127) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:114) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:133) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.handler.impl.ResponseContentTypeHandlerImpl.handle(ResponseContentTypeHandlerImpl.java:54) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.handler.impl.ResponseContentTypeHandlerImpl.handle(ResponseContentTypeHandlerImpl.java:28) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1127) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:151) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:133) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.handler.impl.BodyHandlerImpl$BHandler.doEnd(BodyHandlerImpl.java:313) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.handler.impl.BodyHandlerImpl$BHandler.end(BodyHandlerImpl.java:290) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.handler.impl.BodyHandlerImpl.lambda$handle$0(BodyHandlerImpl.java:86) ~[vertx-web-4.1.0.jar:4.1.0]
at io.vertx.ext.web.handler.impl.BodyHandlerImpl$$Lambda$369/0x000000084038e040.handle(Unknown Source) ~[?:?]
at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:96) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:59) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.HttpEventHandler.handleEnd(HttpEventHandler.java:76) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerRequest.onEnd(Http1xServerRequest.java:556) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerRequest.handleEnd(Http1xServerRequest.java:542) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerConnection$$Lambda$370/0x000000084038d840.handle(Unknown Source) ~[?:?]
at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:70) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.impl.DuplicatedContext.execute(DuplicatedContext.java:162) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerConnection.onEnd(Http1xServerConnection.java:199) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerConnection.onContent(Http1xServerConnection.java:186) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerConnection.handleOther(Http1xServerConnection.java:156) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xServerConnection.handleMessage(Http1xServerConnection.java:144) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:155) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:154) ~[vertx-core-4.1.0.jar:4.1.0]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.handler.codec.http.websocketx.extensions.WebSocketServerExtensionHandler.channelRead(WebSocketServerExtensionHandler.java:99) ~[netty-codec-http-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) ~[netty-codec-4.1.65.Final.jar:4.1.65.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) ~[netty-codec-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.vertx.core.http.impl.Http1xOrH2CHandler.end(Http1xOrH2CHandler.java:61) ~[vertx-core-4.1.0.jar:4.1.0]
at io.vertx.core.http.impl.Http1xOrH2CHandler.channelRead(Http1xOrH2CHandler.java:38) ~[vertx-core-4.1.0.jar:4.1.0]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[netty-common-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.65.Final.jar:4.1.65.Final]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
Wondering how big is the content you are sending through the producer ...
@amitoj any updates on this one? or you managed to fix it and we can close it?
@amitoj I am going to close it, feel free to re-open if it's still an issue for you.