h2 icon indicating copy to clipboard operation
h2 copied to clipboard

Http2 protocol violation

Open erebe opened this issue 1 year ago • 6 comments

Hello,

I am reporting an issue I have encountered while using tonic grpc server (v0.8.0). We use nginx as a GRPC reverse proxy and hit the following corner case. Full details here: https://trac.nginx.org/nginx/ticket/2376 you can also find a pcap capture of the issue in the ticket

With this setup [client] === grpc stream (with ssl) ===> [nginx] === grpc stream (cleartext) ===> [backend]

and considering the following grpc service description

service Agent {
     rpc StreamToServer(stream Empty {}) returns (Empty) {}
}

If the http2 server generated by tonic early respond to the client without consuming fully the input stream

impl Agent {
    async fn stream_to_server (
        &self,
        request: Request<Streaming<Empty>>,
    ) -> Result<Response<Empty>, Status> {
        let _ = request.into_inner();
        // we don't care of the request at all
        // Just sending back our response before end of client stream
        Ok(Response::new(Empty {}))
    }
}

The following packets are going to be emitted by tonic/hyper/h2, in response to the call.

HEADERS[1],  DATA[1] (GRPC empty data),  HEADERS[1] (TRAILING with flag end_stream) , RST_STREAM(error: CANCEL)[1]

This specific sequence of packet is causing nginx to miss-behave and not forward the DATA & RST_STREAM back to the client. After discussion with a maintainer of nginx, this is caused by the last RST_STREAM(error: CANCEL) which is invalid in regard of the spec, it should be a RST_STREAM(NO_ERROR)

As per the RFC

   ... A server can
   send a complete response prior to the client sending an entire
   request if the response does not depend on any portion of the request
   that has not been sent and received.  When this is true, a server MAY
   request that the client abort transmission of a request without error
   by sending a RST_STREAM with an error code of NO_ERROR after sending
   a complete response (i.e., a frame with the END_STREAM flag).
   Clients MUST NOT discard responses as a result of receiving such a
   RST_STREAM, though clients can always discard responses at their
   discretion for other reasons.

I tracked down where is coming this RST_STREAM(CANCEL), at first I thought it was in tonic but it ended-up being in the impl Drop of h2

h2::proto::streams::streams::maybe_cancel streams.rs:1467
h2::proto::streams::streams::drop_stream_ref::{closure#0} streams.rs:1443
h2::proto::streams::counts::Counts::transition<h2::proto::streams::streams::drop_stream_ref::{closure_env#0}, ()> counts.rs:127
h2::proto::streams::streams::drop_stream_ref streams.rs:1442
h2::proto::streams::streams::{impl#12}::drop streams.rs:1403

https://github.com/hyperium/h2/blob/756384f4cdd62bce3af7aa53a156ba2cc557b5ec/src/proto/streams/streams.rs#L1466

So it seems a generic way of handling the end of stream for h2. And now I am stuck in my investigation and require some guidance regarding where to go from here.

  • Replacing the Reason::CANCEL by Reason::NO_ERROR solves my issue with nginx (tested locally) and seem more adequate in regard of the RFC, but I don't know if there is un-expected side effect to that for other cases
  • I don't know if it is tonic or hyper that miss use h2 and require somewhere some special handling for this scenario

We really like tonic and nginx and would appreciate if we can go forward to make both happy to work together

erebe avatar Aug 12 '22 19:08 erebe

Yes, I believe we've discussed before that there are cases that this should send NO_ERROR instead of CANCEL. Probably only in this specific case:

The full response has been sent (or scheduled to send), and any option DATA and trailers. Basically, that an end-of-stream frame is going out.

I think CANCEL is still correct in the general case, where the user has dropped all handles but never triggered end-of-stream, and also if the client drops handle waiting for the response.

seanmonstar avatar Aug 12 '22 20:08 seanmonstar

Hello, thanks for the answer !

So I am at the right place :fireworks: Is changing the maybe_cancel code by something like this would feel right to you ? Or is there some other cases to handle ?

fn maybe_cancel(stream: &mut store::Ptr, actions: &mut Actions, counts: &mut Counts) {
    if stream.is_canceled_interest() {
+        // To match HalfClosedLocal(Streaming) state
+       let reason = if stream.state.is_send_closed() && stream.state.is_recv_streaming() {
+            Reason::NO_ERROR
+        } else {
+           Reason::CANCEL
+        };

        actions
            .send
            .schedule_implicit_reset(stream, reason, counts, &mut actions.task);
        actions.recv.enqueue_reset_expiration(stream, counts);
    }
}

erebe avatar Aug 12 '22 21:08 erebe

Ummmm, hard for me to remember. I imagine if you changed that function without the conditional, some unit test would complain? That might show which unit test needs to be updated, or maybe we dont have this specific case as a test...

seanmonstar avatar Aug 12 '22 21:08 seanmonstar

Ok, so I am impressed by the quality of your tests ... Would dream to have those mock in my projects.

Regarding the issue, by adding an is_server() check in the condition and adapting 1 test that seems to match my case. Everything is green with a cargo test --all-features --all

diff --git a/src/proto/streams/streams.rs b/src/proto/streams/streams.rs
index f11ee08..bfce840 100644
--- a/src/proto/streams/streams.rs
+++ b/src/proto/streams/streams.rs
@@ -1461,9 +1461,16 @@ fn drop_stream_ref(inner: &Mutex<Inner>, key: store::Key) {

 fn maybe_cancel(stream: &mut store::Ptr, actions: &mut Actions, counts: &mut Counts) {
     if stream.is_canceled_interest() {
+        // To match HalfClosedLocal(Streaming) state
+        let reason = if counts.peer().is_server() && stream.state.is_send_closed() && stream.state.is_recv_streaming() {
+            Reason::NO_ERROR
+        } else {
+            Reason::CANCEL
+        };
+
         actions
             .send
-            .schedule_implicit_reset(stream, Reason::CANCEL, counts, &mut actions.task);
+            .schedule_implicit_reset(stream, reason, counts, &mut actions.task);
         actions.recv.enqueue_reset_expiration(stream, counts);
     }
 }
diff --git a/tests/h2-tests/tests/server.rs b/tests/h2-tests/tests/server.rs
index b3bf1a2..e61fd3b 100644
--- a/tests/h2-tests/tests/server.rs
+++ b/tests/h2-tests/tests/server.rs
@@ -566,7 +566,7 @@ async fn sends_reset_cancel_when_req_body_is_dropped() {
         client
             .recv_frame(frames::headers(1).response(200).eos())
             .await;
-        client.recv_frame(frames::reset(1).cancel()).await;
+        client.recv_frame(frames::reset(1).reason(Reason::NO_ERROR)).await;
     };

     let srv = async move {

If this seems good to you, I can go forward and make a PR next week. If you require some other things before, let me know

erebe avatar Aug 12 '22 22:08 erebe

Nice, looks good to me!

seanmonstar avatar Aug 12 '22 23:08 seanmonstar

Here we are https://github.com/hyperium/h2/pull/634 :) It needs some approval to run the github actions

erebe avatar Aug 13 '22 07:08 erebe

Thank you for the kind words and your reactivity ^^

By any chances, do you plan to make a new release soonish ?

erebe avatar Aug 16 '22 06:08 erebe

This feels related to https://github.com/hyperium/tonic/issues/992, right?

LucioFranco avatar Aug 16 '22 16:08 LucioFranco

Hello @LucioFranco Most likely yes, I haven't tried with a python client, but the issue you describe is the one I had with nginx. Only RST_STREAM(NO_ERROR) is allowed on early server response.

I was planning to open an issue in tonic to ask to bump h2 version, once a new release is made, but seems you are ahead of me

erebe avatar Aug 16 '22 17:08 erebe

Thank you @seanmonstar and @LucioFranco , I am closing the issue as it is resolved and released :+1:

erebe avatar Aug 17 '22 19:08 erebe

Thanks again for the excellent work! If you're ever interested in helping on other aspects of h2, we'd welcome your work gladly! <3

seanmonstar avatar Aug 17 '22 20:08 seanmonstar