Duplicate handler name: request-body-streamer
Discussed in #1627, but spinning off as it's a separate issue.
Under load, when using the NettyReactiveStreamsBody, I can reliably reproduce the following error:
[info] java.lang.IllegalArgumentException: Duplicate handler name: request-body-streamer
[info] at io.netty.channel.DefaultChannelPipeline.checkDuplicateName(DefaultChannelPipeline.java:1066)
[info] at io.netty.channel.DefaultChannelPipeline.filterName(DefaultChannelPipeline.java:284)
[info] at io.netty.channel.DefaultChannelPipeline.addLast(DefaultChannelPipeline.java:204)
[info] at io.netty.channel.DefaultChannelPipeline.addLast(DefaultChannelPipeline.java:195)
[info] at org.asynchttpclient.netty.request.body.NettyReactiveStreamsBody.write(NettyReactiveStreamsBody.java:59)
[info] at org.asynchttpclient.netty.request.NettyRequestSender.writeRequest(NettyRequestSender.java:426)
[info] at org.asynchttpclient.netty.request.NettyRequestSender.sendRequestWithOpenChannel(NettyRequestSender.java:259)
[info] at org.asynchttpclient.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:141)
[info] at org.asynchttpclient.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:113)
[info] at org.asynchttpclient.DefaultAsyncHttpClient.execute(DefaultAsyncHttpClient.java:241)
[info] at org.asynchttpclient.DefaultAsyncHttpClient.executeRequest(DefaultAsyncHttpClient.java:210)
I have a Scala reproduction with http4s. It can be seen with sbt run.
The problem is that AHC offers the channel back to the pool manager as soon as the last response chunk is read, which causes a race condition against the NettySubscriber's completion. If removeFromPipeline is not called before the channel is offered back to the pool and polled out to another reactive streams request, we hit the exception.
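For illustration, the duplicate-name failure can be reproduced in isolation on a Netty EmbeddedChannel (this is just a sketch of the ordering, not AHC code; the plain ChannelOutboundHandlerAdapter instances stand in for the NettySubscriber):

import io.netty.channel.ChannelOutboundHandlerAdapter
import io.netty.channel.embedded.EmbeddedChannel

val channel = new EmbeddedChannel()

// Request A: NettyReactiveStreamsBody.write attaches its subscriber to the pipeline.
channel.pipeline.addLast("request-body-streamer", new ChannelOutboundHandlerAdapter)

// The last response chunk is read and the channel goes back to the pool before the
// subscriber's completion removes the handler. Request B then reuses the channel,
// and the second addLast throws:
channel.pipeline.addLast("request-body-streamer", new ChannelOutboundHandlerAdapter)
// java.lang.IllegalArgumentException: Duplicate handler name: request-body-streamer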
A workaround is to remove the handler from the pipeline in a hook in AsyncHandler:
import java.util.NoSuchElementException
import io.netty.channel.Channel

// Inside an AsyncHandler implementation:
override def onConnectionOffer(channel: Channel): Unit = {
  try channel.pipeline.remove("request-body-streamer")
  catch {
    case _: NoSuchElementException => () // this is just a cleanup of last resort
  }
}
This works, but is bad for a couple reasons:
- It runs regardless of the body type of the request.
- The name of the pipeline handler is a private detail of the NettySubscriber.
Alternatively, NettyReactiveStreamsBody could remove the existing handler if the addLast call fails, and then retry adding the subscriber. This isn't the cleanest solution, but it at least keeps the fix to one class. I'm going with the onConnectionOffer hack for now, but I can work on a PR for this alternative unless there's a better hook that I'm missing.
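For what it's worth, that alternative would look roughly like this; a sketch against the Netty pipeline API (the real change would live inside the Java NettyReactiveStreamsBody, and the helper name here is made up):

import java.util.NoSuchElementException
import io.netty.channel.{ChannelHandler, ChannelPipeline}

def addBodyStreamer(pipeline: ChannelPipeline, subscriber: ChannelHandler): Unit =
  try pipeline.addLast("request-body-streamer", subscriber)
  catch {
    case _: IllegalArgumentException =>
      // A stale handler from an earlier request is still attached: drop it and retry once.
      try pipeline.remove("request-body-streamer")
      catch { case _: NoSuchElementException => () }
      pipeline.addLast("request-body-streamer", subscriber)
  }

Retrying once should be enough, since a pipeline can only hold one handler under that name at a time.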
Honestly, reactive streams support was contributed by people from Lightbend, but nobody has been maintaining it since then. Please contribute if you can.
My solution has revealed a new problem:
[error] java.lang.IllegalStateException: unexpected message type: DefaultHttpRequest, state: 1 (HttpObjectEncoder.java:86)
[error] io.netty.handler.codec.http.HttpObjectEncoder.encode(HttpObjectEncoder.java:86)
[error] io.netty.handler.codec.http.HttpClientCodec$Encoder.encode(HttpClientCodec.java:167)
[error] io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89)
[error] io.netty.channel.CombinedChannelDuplexHandler.write(CombinedChannelDuplexHandler.java:348)
[error] io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
[error] io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
[error] io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
[error] io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
[error] io.netty.handler.stream.ChunkedWriteHandler.doFlush(ChunkedWriteHandler.java:324)
[error] io.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:135)
[error] io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
[error] io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
[error] io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:38)
[error] io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1175)
[error] io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1098)
[error] io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
[error] io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
[error] io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:495)
[error] io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
I'm afraid that my fix is preventing it from writing the LastHttpContent.EMPTY_LAST_CONTENT in the subscriber's complete() handler. I think we are recycling the channel based on receiving the last chunk of the response, but we also need to await the end of the request. Does that sound plausible?
I'm seeing a similar stack trace, deep in the bowels of Netty, when using http4s' AsyncHttpClient client: very frequently on one of six test networks (maybe 1 in 10), but rarely everywhere else, including production networks (maybe 1 in 10,000, which is hard to measure):
Caused by: java.lang.IllegalStateException: unexpected message type: DefaultHttpRequest, state: 1
at io.netty.handler.codec.http.HttpObjectEncoder.encode(HttpObjectEncoder.java:86)
at io.netty.handler.codec.http.HttpClientCodec$Encoder.encode(HttpClientCodec.java:167)
at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89)
I have only observed incidents of the problem when sending a request with a body just after sending (and receiving the result of) a request with no body. I think that's my best clue, since the failing code in the Netty library is line 86 in https://github.com/netty/netty/blob/00afb19d7a37de21b35ce4f6cb3fa7f74809f2ab/codec-http/src/main/java/io/netty/handler/codec/http/HttpObjectEncoder.java:
if (state != ST_INIT) {
    throw new IllegalStateException("unexpected message type: " + StringUtil.simpleClassName(msg)
            + ", state: " + state);
}
ST_INIT is from this family of constants:
private static final int ST_INIT = 0;
private static final int ST_CONTENT_NON_CHUNK = 1;
private static final int ST_CONTENT_CHUNK = 2;
private static final int ST_CONTENT_ALWAYS_EMPTY = 3;
1 is ST_CONTENT_NON_CHUNK. I interpret that as either "the encoder really needs to be in ST_INIT" or "ST_CONTENT_NON_CHUNK should be OK, but this check prevents the rest of the code from using it."
I've made changes in our code base to wait a few seconds, then try the HTTP request again. That mitigates the problem, but doesn't fix it, and it doesn't explain the mystery on that one test system. My opening guess is that Netty is trying to reuse some object from the previous interaction, and that object is in the wrong state.
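For reference, the wait-and-retry mitigation is roughly this shape (a sketch assuming cats-effect 2, which http4s 0.21 uses; the helper name and retry policy are ours, not part of http4s or AHC):

import scala.concurrent.duration._
import cats.effect.{IO, Timer}

// Retry an effect a bounded number of times, sleeping between attempts. In practice you
// would probably match only the "unexpected message type" IllegalStateException rather
// than retrying on every error.
def retryWithDelay[A](attempt: IO[A], retriesLeft: Int = 3)(implicit timer: Timer[IO]): IO[A] =
  attempt.handleErrorWith { error =>
    if (retriesLeft > 0) IO.sleep(5.seconds).flatMap(_ => retryWithDelay(attempt, retriesLeft - 1))
    else IO.raiseError(error)
  }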
I think this ticket is related since Ross wrote so much of the http4s code.
Is there something better than "try again in a few seconds" to mitigate the problem?
There's a hint of something similar here: https://github.com/netty/netty/issues/7993 that I was able to follow to https://github.com/reactor/reactor-netty/commit/13fbcdac6b8cef66272eac691a63db68102b3e43 , but I'm a bit out of my depth here.
I'm also observing this under load with:
- http4s 0.21.7
- async-http-client 2.10.5
- fs2 2.4.4
Is this benign or could it cause re-sending / receiving data that has been sent / received before?
Becky - I haven't observed that, but haven't chased it either. Anecdotally I've not observed a request recovering on its own, but haven't looked for that.
(Our data model handles 0-or-more gracefully: 0 is problematic, and 2 or more is wasteful but otherwise fine. We'd have to hunt for that aspect of the problem to observe it. We work around the bug by observing the exception, waiting 5 seconds, then attempting an identical HTTP request. I haven't observed needing more than 3 attempts.)
Not benign; "bounded", maybe. Incidents of the bug tend to clump together. I believe it's a systemic problem: when the system is under pressure it happens frequently; when resources are available we rarely see it.
We've faced that issue too, and I believe I have a fix for it here. @rossabaker Your reproducer is not there anymore; can you check that commit, please?
Sorry, I cleaned up my GitHub a while back, and don't have that repo anymore.
I've seen this issue today. @Squiry, is your fix still working out well for you? Should we raise a PR?
This fix does work in my case. I will create a PR, but I'm not sure about the state of this project, tbh.