tonic
tonic copied to clipboard
Client-side streaming early OK fails with `StatusCode.CANCELLED` from python client
Bug Report
Version
└── tonic v0.7.1
└── tonic-build v0.7.1
Platform
Darwin Mustafas-MacBook-Pro.local 20.5.0 Darwin Kernel Version 20.5.0: Sat May 8 05:10:33 PDT 2021; root:xnu-7195.121.3~9/RELEASE_X86_64 x86_64
Description
I implemented simple client-side streaming endpoint using tonic-rs
. It may return early if some condition occurs:
async fn record_route(
&self,
request: Request<Streaming<Point>>,
) -> Result<Response<RouteSummary>, Status> {
let mut stream = request.into_inner();
let mut summary = RouteSummary::default();
while let Some(point) = stream.next().await {
let point = point?;
summary.point_count += 1;
if point.latitude == 0 {
println!("Return early: {}", summary.point_count);
return Ok(Response::new(summary));
}
}
println!("Return normal: {}", summary.point_count);
Ok(Response::new(summary))
}
However, when I call this server from a simple python client it fails with the exception (StatusCode.CANCELLED
) below:
# python client.py 5000
port = 5000
early_return_index = 2
Sleep 500 ms
Sleep 500 ms
Sleep 500 ms
Traceback (most recent call last):
File "client.py", line 46, in <module>
main()
File "client.py", line 42, in main
call_stream(client)
File "client.py", line 34, in call_stream
print(client.RecordRoute(iter_points()))
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 1131, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Received RST_STREAM with error code 8"
debug_error_string = "{"created":"@1651470233.217719000","description":"Error received from peer ipv6:[::1]:5000","file":"src/core/lib/surface/call.cc","file_line":1075,"grpc_message":"Received RST_STREAM with error code 8","grpc_status":1}"
>
GitHub repo to reproduce issue: https://github.com/ebraraktas/tonic-client-stream-early-ok
As the Issue section of README of the linked repo states, it runs ok with rust client. I know this may be related with grpc Python, but same server implemented in Python works smoothly if it is called from same Python client.
Yeah interesting, I believe the cancelled comes from the python code since the h2 connection is getting RST. Curious if you could reproduce this with a rust client?
Hi @LucioFranco , as I stated in my first message I can call rust server from rust client without an issue. I am not familiar with the internals of the tonic
, but can you explain why you think that the cancelled comes from python?
In addition, I implemented same server in go
and I could manage to call go server from the same python client without an issue. Interestingly, go client can call rust server with no issue as well. Here is a matrix describing the issue:
client\server | rust | go | python |
---|---|---|---|
rust | :white_check_mark: | :no_entry:* | :white_check_mark: |
go | :white_check_mark: | :white_check_mark: | :white_check_mark: |
python | :no_entry: | :white_check_mark: | :white_check_mark: |
*: My rust client gives error below when I try to connect to the go server:
Error: tonic::transport::Error(Transport, hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 61, kind: ConnectionRefused, message: "Connection refused" })))
So the cancelled is likely returned from the client when it encounters a GoAway/connection drop/request ended before getting the full trailers etc. The error is not passed from server -> client. As for why go works and python does not....that one I don't know I would maybe open an issue in the python implementation?
As for your last error there that is just a regular tcp error coming from the tcp connection in the underlying OS, so maybe you're not connecting on the right port etc.
@LucioFranco I opened an issue in the official gRPC
repository and they responded that they think the issue arises from gRPC
implementation of the tonic-rs
. The linked message has the details that I am not very familiar with, but I hope they may be helpful.
@LucioFranco are there any update on this issue?
Hi @ebraraktas I've started to look but have had a few other high priority items come up. I am about to head out on vacation so not sure how far I will get on this until I get back. This sounds like it will need some low level investigation to understand where the gap is.
Small update: I dug pretty deep into the issue on Friday afternoon (weekend hit before I had time to reply here). From what I can tell the behavior from the Rust side is interesting in that we send a RST_STREAM.Cancel even though we reply with a regular Ok status. My theory has to do with the fact that we drop the RecvStream before the sender side has half-closed it. Will keep looking into this and return with a better answer.
Once we get the next release out of h2 https://github.com/hyperium/h2/issues/633 this will fix this issue I believe.
I tried the repro repo with the patch, and indeed it should fix the issue.
/tmp/tonic-client-stream-early-ok/proto main 16s
venv ❯ python client.py
port = 5000
early_return_index = 3
Sleep 500 ms
Sleep 500 ms
Sleep 500 ms
Sleep 500 ms
point_count: 4
/tmp/tonic-client-stream-early-ok main 1m 25s
❯ cargo run --bin routeguide-server
Finished dev [unoptimized + debuginfo] target(s) in 0.05s
Running `target/debug/routeguide-server`
Return early: 4
At least no traceback
I confirm that it is fixed. Updated & archived the repro repo. Thanks for your efforts @erebe and @LucioFranco .