dubbo icon indicating copy to clipboard operation
dubbo copied to clipboard

[Bug] Is there any graceful way to prevent running out of http2 stream id in Dubbo3.x

Open PaulTan94 opened this issue 10 months ago • 13 comments

Pre-check

  • [X] I am sure that all the content I provide is in English.

Search before asking

  • [X] I had searched in the issues and found no similar issues.

Apache Dubbo Component

Java SDK (apache/dubbo)

Dubbo Version

Dubbo 3.x

Steps to reproduce this issue

Based on the discussion in #13731, I think it is an improper way to make a reconnection triggered by GoAwayFrame(stream id exhausted) cause there will be 100% request failure before the new connection being active and it is unacceptable for production scenario. To make sure the connection is consistently available, is it possible to make a new connection before the stream id is about to be exhausted?

What you expected to happen

  1. Make a new connection before the stream id is about to be exhausted.

Anything else

No response

Are you willing to submit a pull request to fix on your own?

  • [X] Yes I am willing to submit a pull request on my own!

Code of Conduct

PaulTan94 avatar Apr 17 '24 12:04 PaulTan94

@EarthChen @icodening PTAL

BTW, I also want to know if there is a good way

AlbumenJ avatar May 08 '24 03:05 AlbumenJ

I think should be reconnect when throw Http2NoMoreStreamIdsException, and the message will be suspend until stream is create success.

icodening avatar May 18 '24 13:05 icodening

Is it possible to proactively reconnect instead of waiting for an exception to occur?

oxsean avatar May 18 '24 14:05 oxsean

According to https://datatracker.ietf.org/doc/rfc9113/ 5.1.1

Stream identifiers cannot be reused. Long-lived connections can result in an endpoint exhausting the available range of stream identifiers. A client that is unable to establish a new stream identifier can establish a new connection for new streams. A server that is unable to establish a new stream identifier can send a GOAWAY frame so that the client is forced to open a new connection for new streams.

May be Http2NoMoreStreamIdsException is not required, we should reconnect automatically

FoghostCn avatar May 19 '24 01:05 FoghostCn

When stream exhaust occur, Server send GOAWAY frame to prevent client-endpoint to open a new stream with higher stream id.

GOAWAY is a part of http/2 error-handling. We may handle it as HTTP/2 transportation layer error.

Currently TripleGoAwayHandler#channelRead, NettyConnectionHandler#onGoAway, NettyConnectionClient#onGoaway implements close-and-reconnect on connection layer.

Suggest to raise RpcException with code NETWORK_EXCEPTION to trigger non-biz RPC call retry mechansim.

Chenjp avatar Jun 06 '24 05:06 Chenjp

@PaulTan94 @oxsean @icodening @AlbumenJ I have an idea to handle GoAway gracefully, but need a testcase to cover. It's difficult to reproduce the scenario. Can you help?

Chenjp avatar Jun 21 '24 07:06 Chenjp

@PaulTan94 @oxsean @icodening @AlbumenJ I have an idea to handle GoAway gracefully, but need a testcase to cover. It's difficult to reproduce the scenario. Can you help?

Hi, there is a tricky way to control the stream id step in Netty

  1. Copy the io.netty.handler.codec.http2.DefaultHttp2Connection.java file.
  2. Change the default step(+2) in 'incrementAndGetNextStreamId' and 'incrementExpectedStreamId' method.
  3. Put the changed java file under your project's classpath, ClassLoader will premarily load the class under current project's classpath.

PaulTan94 avatar Jun 21 '24 09:06 PaulTan94

According to https://datatracker.ietf.org/doc/rfc9113/ 5.1.1

Stream identifiers cannot be reused. Long-lived connections can result in an endpoint exhausting the available range of stream identifiers. A client that is unable to establish a new stream identifier can establish a new connection for new streams. A server that is unable to establish a new stream identifier can send a GOAWAY frame so that the client is forced to open a new connection for new streams.

May be Http2NoMoreStreamIdsException is not required, we should reconnect automatically

The way mentioned in RFC9113 is mostly for Web scenario, right? For RPC scenario, it is much more proper to start a reconnection proactively instead of waiting for Http2NoMoreStreamIdsException.

PaulTan94 avatar Jun 23 '24 04:06 PaulTan94

I hava an idea of implementing a handler in Netty handler pipeline, which will monitor the latest stream id and trigger a channel switching (without traffic loss) once the stream id is about to reach the limit. I am not sure that is the right way, but anyway I will post a PR later.

PaulTan94 avatar Jun 23 '24 04:06 PaulTan94

@PaulTan94 @oxsean @icodening @AlbumenJ I have an idea to handle GoAway gracefully, but need a testcase to cover. It's difficult to reproduce the scenario. Can you help?

Hi, there is a tricky way to control the stream id step in Netty

  1. Copy the io.netty.handler.codec.http2.DefaultHttp2Connection.java file.
  2. Change the default step(+2) in 'incrementAndGetNextStreamId' and 'incrementExpectedStreamId' method.
  3. Put the changed java file under your project's classpath, ClassLoader will premarily load the class under current project's classpath.

It is not recommended to do this as the order of class loading cannot be guaranteed

icodening avatar Jun 24 '24 02:06 icodening

GoAway failure (caused by stream id exhaustion) occurs in one of billions probability, we need have a trade-off between fix-cost with introduce complexity and one-of-billion impact. I think NO NEED TO FIX IT. If you expect a 100% reliable system, it's better to deploy multiple service provider nodes and enable retry time configuration for service consumer.

Chenjp avatar Jul 03 '24 07:07 Chenjp

No news is good news. @AlbumenJ @PaulTan94 PTAL

Chenjp avatar Aug 05 '24 06:08 Chenjp

I think it's necessary to reconnecting, and this shouldn't cost much @Chenjp

I think should be reconnect when throw Http2NoMoreStreamIdsException, and the message will be suspend until stream is create success.

oxsean avatar Aug 05 '24 11:08 oxsean