8371903: HttpClient: improve handling of HTTP/2 GOAWAY frames with error code
Problem
When the HTTP/2 client receives a GOAWAY frame with a non-zero error code, the current implementation discards both the error code and debug data. Users only see generic "Connection closed by peer" errors without any information about why the server terminated the connection.
Solution
Per RFC 9113 §5.4.1, a GOAWAY frame with a non-zero error code indicates a connection error requiring immediate closure. This fix:
- Distinguishes graceful shutdown (NO_ERROR) from connection errors
- Preserves error code and debug data in exception messages
-
Categorizes streams based on
lastStreamId:- Streams with ID >
lastStreamId: Marked as unprocessed for automatic retry - Streams with ID ≤
lastStreamId: Failed with detailed error information
- Streams with ID >
Changes
Core Implementation (Http2Connection.java):
- Modified
handleGoAway()to check error code and route appropriately - Added
handleGoAwayWithError()method that:- Extracts error code and debug data from GOAWAY frame
- Creates meaningful error messages with error name, hex code, and debug data
- Properly categorizes streams for retry or failure
Test Infrastructure:
-
Http2TestServerConnection.sendGoAway(int, int, byte[]): Supports custom error codes -
Http2TestExchangeImpl.getServerConnection(): Accessor for test handlers -
GoAwayWithErrorTest: Verifies proper error propagation
Example
Before: IOException: Connection closed by peer
After: IOException: Received GOAWAY with error code Protocol error (0x1): Invalid HEADERS frame
Testing
- New
GoAwayWithErrorTestpasses - Existing HTTP/2 tests unaffected (NO_ERROR path unchanged)
- Backward compatible (no public API changes)
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8371903: HttpClient: improve handling of HTTP/2 GOAWAY frames with error code (Bug - P4)
Reviewers
- Daniel Jeliński (@djelinski - Reviewer) Review applies to 6dc77e1b
- Daniel Fuchs (@dfuch - Reviewer) Review applies to 6068c1f9
- Jaikiran Pai (@jaikiran - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28632/head:pull/28632
$ git checkout pull/28632
Update a local copy of the PR:
$ git checkout pull/28632
$ git pull https://git.openjdk.org/jdk.git pull/28632/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28632
View PR using the GUI difftool:
$ git pr show -t 28632
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28632.diff
Using Webrev
:wave: Welcome back ehs208! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@ehs208 This change is no longer ready for integration - check the PR body for details.
@ehs208 The following label will be automatically applied to this pull request:
-
net
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
Webrevs
- 09: Full - Incremental (af092399)
- 08: Full - Incremental (574efae4)
- 07: Full - Incremental (566dfe23)
- 06: Full - Incremental (6068c1f9)
- 05: Full - Incremental (112db211)
- 04: Full - Incremental (6dc77e1b)
- 03: Full - Incremental (88129888)
- 02: Full - Incremental (2b100424)
- 01: Full - Incremental (93c14b3d)
- 00: Full (4e6b2a82)
@djelinski
Thanks for the feedback! Made all the changes:
- Now using
cause.getCloseCause()for consistency - Merged sendGoAway methods as suggested
- Test now verifies unprocessed stream retries with lastStreamId=1
Let me know if anything else needs adjustment!
I'd really like to see some retried requests in the test. Please:
- change the requests to POST; idempotent requests like GET might be retried for other reasons, POST is only retried when unprocessed,
- send all 3 requests at once, so that there's at least a chance that they will be in progress when GOAWAY is sent,
- check that exactly one request fails. It doesn't have to be the first one, the client sometimes reorders the requests internally.
Modified
handleGoAway()to check error code and route appropriatelyAdded
handleGoAwayWithError()method that:
- Extracts error code and debug data from GOAWAY frame
- Creates meaningful error messages with error name, hex code, and debug data
- Properly categorizes streams for retry or failure
This looks quite similar to something I have proposed here:
- https://github.com/laeubi/java-http-client/issues/8
As mentioned there, I think having a generic Exception with a STRING(!) message is not really something useful. I would suggest to add an own exception type so the client code is able to extract further details in a semantic way.
Hi @laeubi, yes it's related to your report. The point of this PR is to bring HTTP2 on par with HTTP3, which throws an exception with a string message. Specialized exception can be handled as a future enhancement, but I'd like to see some evidence that it would be useful to users.
@djelinski Updated per your suggestions:
- Switched to POST requests (only retried when unprocessed)
- Sending all 3 requests at once to test race conditions
- Dynamic lastStreamId based on first arriving request
- Validates exactly 1 failure and 2 successes regardless of order
Test now correctly verifies unprocessed stream retry behavior.
@djelinski Fixed the intermittent test failures:
- Removed
close()call after GOAWAY to avoid connection reset errors - Server now relies on Http2TestServer for connection cleanup
- Tested 10 times locally, all passed
Should be stable now. Please review when you get a chance
@ehs208 thanks for that, the test looks stable now.
Looking at the test logs, I realized why the goAwaySentLatch.await was where you originally had it; now that it's after all 3 requests, the requests are sent on 3 distinct connections, which is not what we want. Please move it back to its original position (after sending first request). Sorry for my confusion.
@djelinski I moved the goAwaySentLatch.await() back to its original position (after the first request, before second/third).
However, after testing locally, I'm seeing the same behavior - all 3 requests create separate connections. I believe this is expected because:
- First request uses connection 1
- Server sends GOAWAY, connection 1 receives it
- Second and third requests are retried on new connections (2 and 3)
Since requests 2 and 3 are retries after GOAWAY, they naturally go to new connections. The await position doesn't change this - it just controls when the retries are sent.
Could you clarify what connection behavior you're expecting? The test is passing and correctly verifies that:
- Exactly 1 request fails with GOAWAY error details
- Exactly 2 requests succeed after retry
Let me know if I'm missing something!
@djelinski Oops, you're right! I moved goAwaySentLatch.await() back to its original position and now it works as expected - only 1 connection is created initially, and the other 2 are retries. Tested locally and all passed.
Right, we only create one connection initially, and requests 2 and 3 are usually sent over the first connection before they are retried on new connections. This can be observed in the server logs as resetting stream 3 as REFUSED_STREAM. Ideally the server should not send anything after sending the GOAWAY frame with error, but I think what we have is good enough,
@djelinski Thanks for the explanation. That makes sense. If there’s an opportunity in the future, I’d be happy to help clean that part up as well. Appreciate your guidance.
/integrate
@ehs208 Your change (at version 6dc77e1b266e30aa2e0b4c0bcf0067e06b65df0f) is now ready to be sponsored by a Committer.
@djelinski Oops, you're right! I moved
goAwaySentLatch.await()back to its original position and now it works as expected - only 1 connection is created initially, and the other 2 are retries. Tested locally and all passed.
FWIW - if the request succeeds (no exception thrown) you can check/assert whether two requests were sent on the same or different connections by comparing the value returned by HttpResponse::connectionLabel(); Though the result is an Optional our implementation will never return an empty optional.
@dfuch Thank you for the thorough review. I've applied Utils.adjustTimeout() to both timeout values as you suggested.
Also, thanks for the tip about HttpResponse::connectionLabel() - that's really helpful to know for verifying connection reuse in the future. For this test, I'll keep the current approach since it's already validating the retry behavior correctly, but I'll definitely keep that in mind for future tests.
/integrate
Thanks for the updates. LGTM.
Thank you!
@ehs208 Your change (at version 6068c1f9710bde3545c91faa5045b3a650bdba38) is now ready to be sponsored by a Committer.
@jaikiran Thank you for the detailed review! I've addressed all your feedback:
-
Http2Connection.java: Removed the
elseblock soclose(cause)→doTerminate()handles processed streams. This ensures connection closure cause and stream termination cause are consistent. -
GoAwayWithErrorTest.java:
- Using
URIBuilderwith server's actual address instead of "localhost" - Removed timeout from
goAwaySentLatch.await() - Removed entire
allOf().orTimeout().join()block - Replaced success counting with direct
assertEquals(200, response.statusCode())
Thank you @ehs208 for these updates. The changes look good to me. I'll run the latest changes in our CI to make sure this continues to pass.
On a related note, have you run tier2 locally with these changes? The GitHub actions job only run tier1 and networking tests aren't covered in that. It would be good to run tier2 on your local setup too - that way any future contributions in this area can be tested locally to verify that changes don't break unexpected existing tests.
@jaikiran I ran jdk_net tests locally. Results: 1047 passed, 1 failed.
The failed test is HTTPSetAuthenticatorTest (HTTP/1.1 Authenticator test) with SocketException: Connection reset, which appears to be a flaky test unrelated to this PR's HTTP/2 GOAWAY changes. All HTTP/2 related tests passed successfully.
/integrate
@ehs208 Your change (at version 566dfe235ae29ae3028c639d55b6c90886c298c5) is now ready to be sponsored by a Committer.
Hello @ehs208, I'm investigating a intermittent test failure of the newly added test in our CI. I think the test will need a few adjustments. I haven't had a chance to go over the failure in its entirety. I will add some details here when I am done looking at that failure.
@jaikiran Hello, Just checking in on the intermittent test failure investigation. No rush, just wanted to see if there's any update or if there's anything I can help with. Thanks
Hello @ehs208, I'm still looking into it (with help from Daniel too), but mostly it's blocked on me. I will need some more time to get back to this, please. It's still on my priority list and I should have something soon.