pulsar
[Bug] bin/pulsar-perf will cause the pulsar service to freeze
Search before asking
- [X] I searched in the issues and found nothing similar.
Read release policy
- [X] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.
Version
- OS: Linux 4.19.91-24.8.an8.x86_64 #1 SMP Tue Aug 31 11:30:53 CST 2021 x86_64 x86_64 x86_64 GNU/Linux
- Java: java version "17.0.11" 2024-04-16 LTS
- Pulsar: v3.0.5
Minimal reproduce step
- start pulsar cluster
- create a persistent topic
- use the bin/pulsar-perf produce tool to produce some messages, as below
bin/pulsar-perf produce \
--admin-url $cluster_host \
--auth-params "token: $token" \
--auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationToken \
--num-messages $num \
--num-producers 10 \
--batch-max-messages 100 \
persistent://$tenant_name/$namespace_name/$topic_name
- The lookup request times out (stack trace below)
- Worse, after this timeout the topic can no longer accept messages at all, even from a producer created with the Pulsar client SDK
java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: Lookup request timeout {'durationMs': '30000', 'reqId':'3305977965004565187', 'remote':'xx.xx.xx.xx/xx.xx.xx.xx:6650', 'local':'/xx.xx.xx.xx:64755'}
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) ~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) ~[?:?]
at org.apache.pulsar.testclient.PerformanceProducer.runProducer(PerformanceProducer.java:598) ~[org.apache.pulsar-pulsar-testclient-3.0.5.jar:3.0.5]
at org.apache.pulsar.testclient.PerformanceProducer.lambda$main$1(PerformanceProducer.java:399) ~[org.apache.pulsar-pulsar-testclient-3.0.5.jar:3.0.5]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: Lookup request timeout {'durationMs': '30000', 'reqId':'3305977965004565187', 'remote':'xx.xx.xx.xx/xx.xx.xx.xx:6650', 'local':'/xx.xx.xx.xx:64755'}
at org.apache.pulsar.client.impl.ClientCnx.checkRequestTimeout(ClientCnx.java:1355) ~[org.apache.pulsar-pulsar-client-original-3.0.5.jar:3.0.5]
at org.apache.pulsar.common.util.Runnables$CatchingAndLoggingRunnable.run(Runnables.java:54) ~[org.apache.pulsar-pulsar-common-3.0.5.jar:3.0.5]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:159) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[io.netty-netty-transport-4.1.108.Final.jar:4.1.108.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty-netty-common-4.1.108.Final.jar:4.1.108.Final]
... 2 more
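For anyone trying to reproduce this, the topic creation step and a check of the topic after the timeout could be scripted roughly as follows. This is only a sketch under assumptions that are not in the report: it is run from the Pulsar home directory, conf/client.conf already carries the service URL and token credentials, and the default tenant, namespace, and topic values are hypothetical placeholders. By default it only prints the commands; setting RUN=1 executes them.

```shell
#!/usr/bin/env bash
# Sketch only: placeholder defaults below are hypothetical, replace them
# with the tenant, namespace, and topic used in the failing run.
tenant_name="${tenant_name:-public}"
namespace_name="${namespace_name:-default}"
topic_name="${topic_name:-perf-freeze-test}"

# Build the fully qualified topic name used by every command.
topic_url() {
  printf 'persistent://%s/%s/%s\n' "$tenant_name" "$namespace_name" "$topic_name"
}

# Dry run by default: print the command. With RUN=1, execute it instead.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

topic="$(topic_url)"

# Before the pulsar-perf run: create a non-partitioned persistent topic.
run bin/pulsar-admin topics create "$topic"

# After the lookup timeout: ask the admin API which broker owns the topic,
# dump its stats, then try to publish one message with the CLI client.
run bin/pulsar-admin topics lookup "$topic"
run bin/pulsar-admin topics stats "$topic"
run bin/pulsar-client produce "$topic" -m "ping" -n 1
```

If the admin lookup answers while the single-message produce still times out, that would point at the binary-protocol lookup path rather than at topic ownership. This is a guess about where to look, not a confirmed diagnosis.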
What did you expect to see?
Pulsar produces and consumes messages normally.
What did you see instead?
The lookup request times out while producing messages, and the topic stays stuck afterwards.
Anything else?
No response
Are you willing to submit a PR?
- [X] I'm willing to submit a PR!