dubbo
dubbo copied to clipboard
[Bug] Is there any graceful way to prevent running out of http2 stream id in Dubbo3.x
Pre-check
- [X] I am sure that all the content I provide is in English.
Search before asking
- [X] I had searched in the issues and found no similar issues.
Apache Dubbo Component
Java SDK (apache/dubbo)
Dubbo Version
Dubbo 3.x
Steps to reproduce this issue
Based on the discussion in #13731, I think it is an improper way to make a reconnection triggered by GoAwayFrame(stream id exhausted) cause there will be 100% request failure before the new connection being active and it is unacceptable for production scenario. To make sure the connection is consistently available, is it possible to make a new connection before the stream id is about to be exhausted?
What you expected to happen
- Make a new connection before the stream id is about to be exhausted.
Anything else
No response
Are you willing to submit a pull request to fix on your own?
- [X] Yes I am willing to submit a pull request on my own!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
@EarthChen @icodening PTAL
BTW, I also want to know if there is a good way
I think should be reconnect when throw Http2NoMoreStreamIdsException
, and the message will be suspend until stream is create success.
Is it possible to proactively reconnect instead of waiting for an exception to occur?
According to https://datatracker.ietf.org/doc/rfc9113/ 5.1.1
Stream identifiers cannot be reused. Long-lived connections can result in an endpoint exhausting the available range of stream identifiers. A client that is unable to establish a new stream identifier can establish a new connection for new streams. A server that is unable to establish a new stream identifier can send a GOAWAY frame so that the client is forced to open a new connection for new streams.
May be Http2NoMoreStreamIdsException
is not required, we should reconnect automatically
When stream exhaust occur, Server send GOAWAY frame to prevent client-endpoint to open a new stream with higher stream id.
GOAWAY is a part of http/2 error-handling. We may handle it as HTTP/2 transportation layer error.
Currently TripleGoAwayHandler#channelRead, NettyConnectionHandler#onGoAway, NettyConnectionClient#onGoaway
implements close-and-reconnect on connection layer.
Suggest to raise RpcException with code NETWORK_EXCEPTION to trigger non-biz RPC call retry mechansim.
@PaulTan94 @oxsean @icodening @AlbumenJ I have an idea to handle GoAway gracefully, but need a testcase to cover. It's difficult to reproduce the scenario. Can you help?
@PaulTan94 @oxsean @icodening @AlbumenJ I have an idea to handle GoAway gracefully, but need a testcase to cover. It's difficult to reproduce the scenario. Can you help?
Hi, there is a tricky way to control the stream id step in Netty
- Copy the io.netty.handler.codec.http2.DefaultHttp2Connection.java file.
- Change the default step(+2) in 'incrementAndGetNextStreamId' and 'incrementExpectedStreamId' method.
- Put the changed java file under your project's classpath, ClassLoader will premarily load the class under current project's classpath.
According to https://datatracker.ietf.org/doc/rfc9113/ 5.1.1
Stream identifiers cannot be reused. Long-lived connections can result in an endpoint exhausting the available range of stream identifiers. A client that is unable to establish a new stream identifier can establish a new connection for new streams. A server that is unable to establish a new stream identifier can send a GOAWAY frame so that the client is forced to open a new connection for new streams.
May be
Http2NoMoreStreamIdsException
is not required, we should reconnect automatically
The way mentioned in RFC9113 is mostly for Web scenario, right? For RPC scenario, it is much more proper to start a reconnection proactively instead of waiting for Http2NoMoreStreamIdsException
.
I hava an idea of implementing a handler in Netty handler pipeline, which will monitor the latest stream id and trigger a channel switching (without traffic loss) once the stream id is about to reach the limit. I am not sure that is the right way, but anyway I will post a PR later.
@PaulTan94 @oxsean @icodening @AlbumenJ I have an idea to handle GoAway gracefully, but need a testcase to cover. It's difficult to reproduce the scenario. Can you help?
Hi, there is a tricky way to control the stream id step in Netty
- Copy the io.netty.handler.codec.http2.DefaultHttp2Connection.java file.
- Change the default step(+2) in 'incrementAndGetNextStreamId' and 'incrementExpectedStreamId' method.
- Put the changed java file under your project's classpath, ClassLoader will premarily load the class under current project's classpath.
It is not recommended to do this as the order of class loading cannot be guaranteed
GoAway failure (caused by stream id exhaustion) occurs in one of billions probability, we need have a trade-off between fix-cost with introduce complexity and one-of-billion impact. I think NO NEED TO FIX IT. If you expect a 100% reliable system, it's better to deploy multiple service provider nodes and enable retry time configuration for service consumer.
No news is good news. @AlbumenJ @PaulTan94 PTAL
I think it's necessary to reconnecting, and this shouldn't cost much @Chenjp
I think should be reconnect when throw
Http2NoMoreStreamIdsException
, and the message will be suspend until stream is create success.