camunda
camunda copied to clipboard
BackupUploadIT.shouldSaveBackupWithManyFiles is flaky
Summary
- How often does the test fail? - Once so far.
- Does it block your work? - No, passed on retry.
- Do we suspect that it is a real failure? - Not sure
Failures
This failure happened in a backport to 8.3. https://github.com/camunda/zeebe/actions/runs/8922653492/job/24505245262?pr=18123
Example assertion failure
Expectingto be completed within 1M. exception caught while trying to get the future result: java.util.concurrent.ExecutionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096) at org.assertj.core.internal.Futures.assertSucceededWithin(Futures.java:109) at org.assertj.core.api.AbstractCompletableFutureAssert.internalSucceedsWithin(AbstractCompletableFutureAssert.java:400) at org.assertj.core.api.AbstractCompletableFutureAssert.succeedsWithin(AbstractCompletableFutureAssert.java:396) at io.camunda.zeebe.backup.s3.BackupUploadIT.shouldSaveBackupWithManyFiles(BackupUploadIT.java:95) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728) at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92) at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138) at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95) at org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:202) at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194) at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182) at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655) at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165) Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111) at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47) at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:223) at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:218) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:182) at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:79) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$null$0(MakeAsyncHttpRequestStage.java:103) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.completeResponseFuture(MakeAsyncHttpRequestStage.java:240) at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:163) at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934) at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:911) at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:840) Caused by: java.lang.IllegalArgumentException: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber at io.netty.channel.DefaultChannelPipeline.checkDuplicateName(DefaultChannelPipeline.java:1055) at io.netty.channel.DefaultChannelPipeline.filterName(DefaultChannelPipeline.java:284) at io.netty.channel.DefaultChannelPipeline.addAfter(DefaultChannelPipeline.java:301) at io.netty.channel.DefaultChannelPipeline.addAfter(DefaultChannelPipeline.java:290) at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.unbufferedWrite(HttpStreamsHandler.java:329) at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.flushNext(HttpStreamsHandler.java:376) at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.write(HttpStreamsHandler.java:270) at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsClientHandler.write(HttpStreamsClientHandler.java:59) at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891) at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956) at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:982) at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:950) at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:1000) at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1025) at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:306) at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.writeRequest(NettyRequestExecutor.java:247) at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.makeRequest(NettyRequestExecutor.java:239) at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.lambda$makeRequestListener$10(NettyRequestExecutor.java:182) at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106) at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ... 1 more
Hypotheses
Logs
Logs
LOGS
Happened again here
Exception
<CompletableFuture[Failed with the following stack trace:
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:223)
at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:218)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:182)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:79)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$execute$0(MakeAsyncHttpRequestStage.java:108)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.completeResponseFuture(MakeAsyncHttpRequestStage.java:255)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:167)
at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934)
at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:911)
at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.IllegalArgumentException: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber
at io.netty.channel.DefaultChannelPipeline.checkDuplicateName(DefaultChannelPipeline.java:1055)
at io.netty.channel.DefaultChannelPipeline.filterName(DefaultChannelPipeline.java:284)
at io.netty.channel.DefaultChannelPipeline.addAfter(DefaultChannelPipeline.java:301)
at io.netty.channel.DefaultChannelPipeline.addAfter(DefaultChannelPipeline.java:290)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.unbufferedWrite(HttpStreamsHandler.java:329)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.flushNext(HttpStreamsHandler.java:376)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.write(HttpStreamsHandler.java:270)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsClientHandler.write(HttpStreamsClientHandler.java:59)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:982)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:950)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:1000)
at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1025)
at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:306)
at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.writeRequest(NettyRequestExecutor.java:245)
at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.makeRequest(NettyRequestExecutor.java:237)
at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.lambda$makeRequestListener$10(NettyRequestExecutor.java:181)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
... 1 more
]>
to be completed within 1M.
exception caught while trying to get the future result: java.util.concurrent.ExecutionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
at org.assertj.core.internal.Futures.assertSucceededWithin(Futures.java:109)
at org.assertj.core.api.AbstractCompletableFutureAssert.internalSucceedsWithin(AbstractCompletableFutureAssert.java:400)
at org.assertj.core.api.AbstractCompletableFutureAssert.succeedsWithin(AbstractCompletableFutureAssert.java:396)
at io.camunda.zeebe.backup.s3.BackupUploadIT.shouldSaveBackupWithManyFiles(BackupUploadIT.java:95)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
at org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:202)
at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:223)
at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:218)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:182)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:79)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$execute$0(MakeAsyncHttpRequestStage.java:108)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.completeResponseFuture(MakeAsyncHttpRequestStage.java:255)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:167)
at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934)
at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:911)
at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.IllegalArgumentException: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber
at io.netty.channel.DefaultChannelPipeline.checkDuplicateName(DefaultChannelPipeline.java:1055)
at io.netty.channel.DefaultChannelPipeline.filterName(DefaultChannelPipeline.java:284)
at io.netty.channel.DefaultChannelPipeline.addAfter(DefaultChannelPipeline.java:301)
at io.netty.channel.DefaultChannelPipeline.addAfter(DefaultChannelPipeline.java:290)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.unbufferedWrite(HttpStreamsHandler.java:329)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.flushNext(HttpStreamsHandler.java:376)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsHandler.write(HttpStreamsHandler.java:270)
at software.amazon.awssdk.http.nio.netty.internal.nrs.HttpStreamsClientHandler.write(HttpStreamsClientHandler.java:59)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:982)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:950)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:1000)
at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1025)
at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:306)
at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.writeRequest(NettyRequestExecutor.java:245)
at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.makeRequest(NettyRequestExecutor.java:237)
at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.lambda$makeRequestListener$10(NettyRequestExecutor.java:181)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
... 1 more
at io.camunda.zeebe.backup.s3.BackupUploadIT.shouldSaveBackupWithManyFiles(BackupUploadIT.java:95)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Happened again here https://github.com/camunda/zeebe/actions/runs/8925112776/job/24513039306?pr=18184
also reported here https://github.com/camunda/zeebe/issues/18185
seems relevant https://github.com/aws/aws-sdk-java-v2/issues/1903
Also seen this multiple times: https://github.com/camunda/zeebe/actions/runs/8926375388/job/24517264052
could easily reproduce locally, second run failed 😆 As we didn't change anything in regards to backups in a longer time I suspect this could be related to a recent aws sdk update.
Given the frequency of it's occurrence I will validate this shortly.
rollback doesn't help, will disable the test for now to let someone else continue deeper investigation next
Happened again twice today: https://camunda.slack.com/archives/C013MEVQ4M9/p1714671606674149 and https://camunda.slack.com/archives/C013MEVQ4M9/p1714714195395769
disabled the test for now as it can be easily reproduced locally to continue the investigation https://github.com/camunda/zeebe/pull/18205 and to eliminate the disruption it causes
Further findings: It is relatively easy to reproduce the test locally, by just rerunning until failure the test. (It takes on average less than 10 times on my machine).
Similar issues:
- https://github.com/linagora/james-project/issues/4819
- https://github.com/http4s/http4s-async-http-client/issues/1
- https://github.com/AsyncHttpClient/async-http-client/issues/1627
Some of the discussion in these issues suggests that the problem is that the AHC offers the channel back to the pool manager as soon as the last response chunk is read, which causes a race condition against the NettySubscriber's completion. Or basically we are cycling through the available channels and uploading objects faster than these can be cleaned and reused.
Hence the error message “Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Duplicate handler name: HttpStreamsClientHandler#0-body-subscriber”.
This seems to be supported by the fact that relates to the speed which we are sending new objects. When locally testing the following things reduced the probability of of this error:
- Reducing the number of parallel connections that we use to upload the objects.
- Increasing the size of the objects uploaded (currently empty ones).
- Making the test slower (with thread.sleep() for example).
None of the previous changes remove completely the occurrence of the incident, just make it rarer.
I don't know a solution for this issue since this seems to be a problem on the AsyncHttpClient side which was not solved yet. I will put this issue back on triage.
I'm a little bit confused. Where do we use AsyncHttpClient
? From what I can see, the AWS SDK uses its own client which is based on Apache or Netty, but both are still owned and maintained by AWS, so I don't know if the issues above are directly related to our own.
I did a mvn dependency:tree | grep async-http-client
just to be sure we don't pull it in anywhere, and came up with nothing. Can you show me how the issues above are related?
@rodrigo-lourenco-lopes - did you see my question?
Sorry I got the libraries confused while I was writing, I meant to say the one from AWS.
While looking again at this issue it seems that in the linked issues they already found the origin of the problem https://github.com/aws/aws-sdk-java-v2/issues/1903#issuecomment-2249693722.
Apparently if a client sends Expect: 100-continue with Content-Length: 0, a server can send the response right away. This is specified in https://datatracker.ietf.org/doc/html/rfc7231#section-5.1.1
In this specific test we were sending 4 000 empty files, we are getting the response immediately and releasing the channels to the pool before these are complete, which makes the probability of using the same channel twice quite high.
I tried to run the same test without using empty files (16 Bytes for example) and the issue no longer happens (tested locally for more than 300 runs, while without the fix, it fails every 5 or 10 runs). I will create a pull request for this.