aws-sdk-java-v2

S3 future cancelation: error not propagated

Open chibenwa opened this issue 3 years ago • 3 comments

Describe the bug

Using OVH S3 services, I encounter timeout issues when the S3 response is only partially received.

The failure produces a driver log entry but is not propagated to my application code for proper handling. Instead, the future is abruptly cancelled, and my application code has to infer "cancelled future == failure".

I would prefer the error handling here to be explicit, and the XML unmarshalling exception to be propagated upon this kind of timeout.

Expected Behavior

I would prefer the error handling here to be explicit, and the XML unmarshalling exception to be propagated upon this kind of timeout.

Current Behavior

Using OVH S3 services, I encounter timeout issues when the S3 response is only partially received.

The failure produces a driver log entry but is not propagated to my application code for proper handling. Instead, the future is abruptly cancelled, and my application code has to infer "cancelled future == failure".

StackTrace:

software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 400, Request ID: null)
    at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:95)
    at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:55)
    at software.amazon.awssdk.protocols.query.internal.unmarshall.AwsXmlErrorUnmarshaller.unmarshall(AwsXmlErrorUnmarshaller.java:99)
    at software.amazon.awssdk.protocols.query.unmarshall.AwsXmlErrorProtocolUnmarshaller.handle(AwsXmlErrorProtocolUnmarshaller.java:102)
    at software.amazon.awssdk.protocols.query.unmarshall.AwsXmlErrorProtocolUnmarshaller.handle(AwsXmlErrorProtocolUnmarshaller.java:82)
    at software.amazon.awssdk.core.http.MetricCollectingHttpResponseHandler.lambda$handle$0(MetricCollectingHttpResponseHandler.java:52)
    at software.amazon.awssdk.core.internal.util.MetricUtils.measureDurationUnsafe(MetricUtils.java:63)
    at software.amazon.awssdk.core.http.MetricCollectingHttpResponseHandler.handle(MetricCollectingHttpResponseHandler.java:52)
    at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.lambda$prepare$0(AsyncResponseHandler.java:89)
    at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler$BaosSubscriber.onComplete(AsyncResponseHandler.java:132)
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$DataCountingPublisher$1.onComplete(ResponseHandler.java:513)
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:250)
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.access$600(ResponseHandler.java:75)
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$PublisherAdapter$1.onComplete(ResponseHandler.java:371)
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.complete(HandlerPublisher.java:447)
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.channelInactive(HandlerPublisher.java:430)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.handler.logging.LoggingHandler.channelInactive(LoggingHandler.java:206)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
    at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:418)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:392)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:357)
    at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:326)
    at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:221)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:392)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:357)
    at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1074)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
    at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:813)
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.lang.Thread.run(Unknown Source)

Reproduction Steps

I tried writing some unit tests that pause a Docker container, but that only triggers a read timeout on the driver.

The issue is likely caused by a timeout while the response is being read, and I have not managed to reproduce that yet.

Possible Solution

Propagate the exception. Avoid just canceling the future upon exceptions.
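To illustrate the difference being requested, here is a minimal sketch that uses only java.util.concurrent.CompletableFuture. It is not the SDK's internal code; the class name FuturePropagationSketch, the helper describeOutcome, and the exception message are hypothetical. It shows that a cancelled future gives the caller nothing but a CancellationException, while a future completed exceptionally hands over the original failure as the cause.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class FuturePropagationSketch {

    /** Describes how a future ended, as seen by calling code (hypothetical helper). */
    static String describeOutcome(CompletableFuture<String> future) {
        try {
            return "value: " + future.join();
        } catch (CancellationException e) {
            // cancel(...) takes no cause argument, so the underlying failure
            // cannot be handed to the caller this way.
            return "cancelled";
        } catch (CompletionException e) {
            // completeExceptionally(...) keeps the original failure as the cause.
            return "failed, cause: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        // What the reporter observes: the future is simply cancelled.
        CompletableFuture<String> cancelled = new CompletableFuture<>();
        cancelled.cancel(true);

        // What the reporter asks for: the failure is propagated explicitly.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(
                new IllegalStateException("connection closed mid-response"));

        System.out.println(describeOutcome(cancelled));
        System.out.println(describeOutcome(failed));
    }
}
```

With explicit propagation, calling code can branch on the cause type instead of treating every cancellation as a failure.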

Additional Information/Context

https://issues.apache.org/jira/browse/JAMES-3800

AWS Java SDK version used

2.17.198

JDK version used

openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1, mixed mode, sharing)

Operating System and version

Ubuntu 20.04.4 LTS

chibenwa, Aug 10 '22 03:08

Hello @chibenwa ,

Thank you very much for your submission. I will bring this up to the team and update you on the status of the issue here.

Best,

Yasmine

yasminetalby, Aug 12 '22 22:08

Hello @chibenwa ,

Update: This has been added to the team backlog. Thank you very much for your submission.

Best,

Yasmine

yasminetalby, Aug 15 '22 20:08

Hi @chibenwa, thank you for your patience. We just have some follow-up questions:

  1. Do you mean the returned future was cancelled? Could you share the stack trace from the returned future?

  2. Looking at the stack trace, it seems the response was fully received. If the SDK had not received the whole response, a different exception would have been thrown. What operation was this? If S3 did not return an error payload, the SDK would just return a generic S3Exception, which seems to be the behavior in this case.
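One way to collect the information asked for in question 1 is sketched below, assuming the future returned by the async client call is an ordinary java.util.concurrent.CompletableFuture. The class FutureFailureReport and the method describeCompletion are hypothetical names, and a plain future stands in for the SDK call. The callback records whether the future was cancelled together with the full stack trace of whatever it completed with.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.CompletableFuture;

public class FutureFailureReport {

    /**
     * Returns a future holding a textual report of how the given future ended:
     * whether it was cancelled, plus the stack trace of the failure, if any.
     */
    static CompletableFuture<String> describeCompletion(CompletableFuture<?> future) {
        return future.handle((value, error) -> {
            if (error == null) {
                return "completed normally";
            }
            StringWriter trace = new StringWriter();
            error.printStackTrace(new PrintWriter(trace, true));
            return "cancelled=" + future.isCancelled() + "\n" + trace;
        });
    }

    public static void main(String[] args) {
        // Stand-in for the future returned by an async client call.
        CompletableFuture<String> response = new CompletableFuture<>();
        CompletableFuture<String> report = describeCompletion(response);

        // Simulate the failure mode described in the issue.
        response.cancel(true);

        System.out.println(report.join());
    }
}
```

Attaching the callback directly to the returned future matters here: stages derived from it would see the failure wrapped in a CompletionException, which can hide whether the original future was cancelled.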

zoewangg, Sep 19 '22 21:09

It looks like this issue has not been active for more than five days. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please add a comment to prevent automatic closure, or if the issue is already closed please feel free to reopen it.

github-actions[bot], Oct 10 '22 21:10