h2
h2 copied to clipboard
Http2 protocol violation
Hello,
I am reporting an issue I have encountered while using tonic grpc server (v0.8.0). We use nginx as a GRPC reverse proxy and hit the following corner case. Full details here: https://trac.nginx.org/nginx/ticket/2376 you can also find a pcap capture of the issue in the ticket
With this setup [client] === grpc stream (with ssl) ===> [nginx] === grpc stream (cleartext) ===> [backend]
and considering the following grpc service description
service Agent {
rpc StreamToServer(stream Empty {}) returns (Empty) {}
}
If the http2 server generated by tonic early respond to the client without consuming fully the input stream
impl Agent {
async fn stream_to_server (
&self,
request: Request<Streaming<Empty>>,
) -> Result<Response<Empty>, Status> {
let _ = request.into_inner();
// we don't care of the request at all
// Just sending back our response before end of client stream
Ok(Response::new(Empty {}))
}
}
The following packets are going to be emitted by tonic/hyper/h2, in response to the call.
HEADERS[1], DATA[1] (GRPC empty data), HEADERS[1] (TRAILING with flag end_stream) , RST_STREAM(error: CANCEL)[1]
This specific sequence of packet is causing nginx to miss-behave and not forward the DATA & RST_STREAM back to the client. After discussion with a maintainer of nginx, this is caused by the last RST_STREAM(error: CANCEL) which is invalid in regard of the spec, it should be a RST_STREAM(NO_ERROR)
As per the RFC
... A server can
send a complete response prior to the client sending an entire
request if the response does not depend on any portion of the request
that has not been sent and received. When this is true, a server MAY
request that the client abort transmission of a request without error
by sending a RST_STREAM with an error code of NO_ERROR after sending
a complete response (i.e., a frame with the END_STREAM flag).
Clients MUST NOT discard responses as a result of receiving such a
RST_STREAM, though clients can always discard responses at their
discretion for other reasons.
I tracked down where is coming this RST_STREAM(CANCEL), at first I thought it was in tonic but it ended-up being in the impl Drop
of h2
h2::proto::streams::streams::maybe_cancel streams.rs:1467
h2::proto::streams::streams::drop_stream_ref::{closure#0} streams.rs:1443
h2::proto::streams::counts::Counts::transition<h2::proto::streams::streams::drop_stream_ref::{closure_env#0}, ()> counts.rs:127
h2::proto::streams::streams::drop_stream_ref streams.rs:1442
h2::proto::streams::streams::{impl#12}::drop streams.rs:1403
https://github.com/hyperium/h2/blob/756384f4cdd62bce3af7aa53a156ba2cc557b5ec/src/proto/streams/streams.rs#L1466
So it seems a generic way of handling the end of stream for h2. And now I am stuck in my investigation and require some guidance regarding where to go from here.
- Replacing the Reason::CANCEL by Reason::NO_ERROR solves my issue with nginx (tested locally) and seem more adequate in regard of the RFC, but I don't know if there is un-expected side effect to that for other cases
- I don't know if it is tonic or hyper that miss use h2 and require somewhere some special handling for this scenario
We really like tonic and nginx and would appreciate if we can go forward to make both happy to work together
Yes, I believe we've discussed before that there are cases that this should send NO_ERROR
instead of CANCEL
. Probably only in this specific case:
The full response has been sent (or scheduled to send), and any option DATA and trailers. Basically, that an end-of-stream frame is going out.
I think CANCEL
is still correct in the general case, where the user has dropped all handles but never triggered end-of-stream, and also if the client drops handle waiting for the response.
Hello, thanks for the answer !
So I am at the right place :fireworks: Is changing the maybe_cancel code by something like this would feel right to you ? Or is there some other cases to handle ?
fn maybe_cancel(stream: &mut store::Ptr, actions: &mut Actions, counts: &mut Counts) {
if stream.is_canceled_interest() {
+ // To match HalfClosedLocal(Streaming) state
+ let reason = if stream.state.is_send_closed() && stream.state.is_recv_streaming() {
+ Reason::NO_ERROR
+ } else {
+ Reason::CANCEL
+ };
actions
.send
.schedule_implicit_reset(stream, reason, counts, &mut actions.task);
actions.recv.enqueue_reset_expiration(stream, counts);
}
}
Ummmm, hard for me to remember. I imagine if you changed that function without the conditional, some unit test would complain? That might show which unit test needs to be updated, or maybe we dont have this specific case as a test...
Ok, so I am impressed by the quality of your tests ... Would dream to have those mock in my projects.
Regarding the issue, by adding an is_server()
check in the condition and adapting 1 test that seems to match my case.
Everything is green with a cargo test --all-features --all
diff --git a/src/proto/streams/streams.rs b/src/proto/streams/streams.rs
index f11ee08..bfce840 100644
--- a/src/proto/streams/streams.rs
+++ b/src/proto/streams/streams.rs
@@ -1461,9 +1461,16 @@ fn drop_stream_ref(inner: &Mutex<Inner>, key: store::Key) {
fn maybe_cancel(stream: &mut store::Ptr, actions: &mut Actions, counts: &mut Counts) {
if stream.is_canceled_interest() {
+ // To match HalfClosedLocal(Streaming) state
+ let reason = if counts.peer().is_server() && stream.state.is_send_closed() && stream.state.is_recv_streaming() {
+ Reason::NO_ERROR
+ } else {
+ Reason::CANCEL
+ };
+
actions
.send
- .schedule_implicit_reset(stream, Reason::CANCEL, counts, &mut actions.task);
+ .schedule_implicit_reset(stream, reason, counts, &mut actions.task);
actions.recv.enqueue_reset_expiration(stream, counts);
}
}
diff --git a/tests/h2-tests/tests/server.rs b/tests/h2-tests/tests/server.rs
index b3bf1a2..e61fd3b 100644
--- a/tests/h2-tests/tests/server.rs
+++ b/tests/h2-tests/tests/server.rs
@@ -566,7 +566,7 @@ async fn sends_reset_cancel_when_req_body_is_dropped() {
client
.recv_frame(frames::headers(1).response(200).eos())
.await;
- client.recv_frame(frames::reset(1).cancel()).await;
+ client.recv_frame(frames::reset(1).reason(Reason::NO_ERROR)).await;
};
let srv = async move {
If this seems good to you, I can go forward and make a PR next week. If you require some other things before, let me know
Nice, looks good to me!
Here we are https://github.com/hyperium/h2/pull/634 :) It needs some approval to run the github actions
Thank you for the kind words and your reactivity ^^
By any chances, do you plan to make a new release soonish ?
This feels related to https://github.com/hyperium/tonic/issues/992, right?
Hello @LucioFranco Most likely yes, I haven't tried with a python client, but the issue you describe is the one I had with nginx. Only RST_STREAM(NO_ERROR) is allowed on early server response.
I was planning to open an issue in tonic to ask to bump h2 version, once a new release is made, but seems you are ahead of me
Thank you @seanmonstar and @LucioFranco , I am closing the issue as it is resolved and released :+1:
Thanks again for the excellent work! If you're ever interested in helping on other aspects of h2
, we'd welcome your work gladly! <3