tower-grpc
An executor based on grpc core completion queue
I'm testing tower-grpc with tokio::runtime::Runtime
as its background executor:
let mut runtime = RuntimeBuilder::new()
    .name_prefix("server-grpc-runtime-")
    .core_threads(2) // thread count.
    .build()
    .unwrap();
let mut http = Http::new();
let mut http2 = http.http2_only(true).clone();
#[allow(deprecated)]
http2.executor(runtime.executor());
I benchmarked a bidi-stream call and recorded throughput on the server side for two values of core_threads
(1 and 2):
Result of 1 core thread:
messages per sec: 333444.48149383126
messages per sec: 325203.2520325203
messages per sec: 325626.831650928
messages per sec: 328947.36842105264
Result of 2 core threads:
messages per sec: 271591.5263443781
messages per sec: 274273.1760833791
messages per sec: 257201.64609053498
messages per sec: 263504.6113306983
Throughput with 2 core threads is lower than with 1, even though the 2-thread run uses more CPU (180% vs 100%).
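To put a number on the gap, the samples above can be averaged and compared. The following is only a small arithmetic sketch over the figures quoted in this issue. The helper names (`mean`, `percent_drop`, `per_cpu`) are made up for illustration, and the 100% and 180% CPU figures are taken as 1.0 and 1.8 cores.

```rust
/// Samples copied from the benchmark output above (messages per second).
const ONE_THREAD: [f64; 4] = [
    333444.48149383126,
    325203.2520325203,
    325626.831650928,
    328947.36842105264,
];
const TWO_THREADS: [f64; 4] = [
    271591.5263443781,
    274273.1760833791,
    257201.64609053498,
    263504.6113306983,
];

/// Arithmetic mean of a non-empty slice.
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Relative drop from `baseline` to `value`, in percent.
fn percent_drop(baseline: f64, value: f64) -> f64 {
    (baseline - value) / baseline * 100.0
}

/// Throughput divided by the number of cores in use
/// (assumption: 100% CPU == 1.0 core, 180% CPU == 1.8 cores).
fn per_cpu(throughput: f64, cores_used: f64) -> f64 {
    throughput / cores_used
}

fn main() {
    let one = mean(&ONE_THREAD);
    let two = mean(&TWO_THREADS);
    println!("mean, 1 core thread:  {:.0} msg/s", one);
    println!("mean, 2 core threads: {:.0} msg/s", two);
    println!("throughput drop:      {:.1}%", percent_drop(one, two));
    println!("per-core, 1 thread:   {:.0} msg/s", per_cpu(one, 1.0));
    println!("per-core, 2 threads:  {:.0} msg/s", per_cpu(two, 1.8));
}
```

With these samples the 2-thread mean comes out a little under 19% below the 1-thread mean, and the per-core throughput is less than half.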
My guess is that with 2 core threads the context-switch overhead is too heavy. In grpc/grpc, however, a call is bound to a completion queue, and each completion queue can be polled by a single thread, which avoids this problem. Is there any plan to implement a similar executor in tower-grpc (or tonic-grpc)?
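To make the idea concrete, here is a rough sketch of "one call, one queue, one polling thread" using only the Rust standard library. It is not the grpc core completion queue API and not anything that exists in tower-grpc; `PinnedPool`, `submit`, and `run_demo` are hypothetical names, and real work would be futures rather than plain closures.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A unit of work for one call (hypothetical type for this sketch).
type Task = Box<dyn FnOnce() + Send + 'static>;

/// One queue plus the single thread that drains it.
struct PinnedQueue {
    sender: Sender<Task>,
    worker: JoinHandle<()>,
}

/// Hypothetical pool: `n` queues, each polled by exactly one thread.
struct PinnedPool {
    queues: Vec<PinnedQueue>,
}

impl PinnedPool {
    fn new(n: usize) -> Self {
        assert!(n > 0, "need at least one queue");
        let queues = (0..n)
            .map(|i| {
                let (sender, receiver) = channel::<Task>();
                let worker = thread::Builder::new()
                    .name(format!("pinned-queue-{}", i))
                    // The loop ends once the sender is dropped.
                    .spawn(move || {
                        for task in receiver {
                            task();
                        }
                    })
                    .expect("failed to spawn worker");
                PinnedQueue { sender, worker }
            })
            .collect();
        PinnedPool { queues }
    }

    /// All work for `call_id` goes to the same queue, so it never
    /// migrates between threads.
    fn submit<F: FnOnce() + Send + 'static>(&self, call_id: usize, f: F) {
        let idx = call_id % self.queues.len();
        self.queues[idx].sender.send(Box::new(f)).expect("worker exited");
    }

    /// Close every queue and wait for its worker to finish.
    fn shutdown(self) {
        for q in self.queues {
            drop(q.sender);
            q.worker.join().expect("worker panicked");
        }
    }
}

/// Runs `steps` tasks for each of `calls` calls and records which
/// thread executed each one.
fn run_demo(queues: usize, calls: usize, steps: usize) -> Vec<(usize, String)> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let pool = PinnedPool::new(queues);
    for _ in 0..steps {
        for call_id in 0..calls {
            let log = Arc::clone(&log);
            pool.submit(call_id, move || {
                let name = thread::current().name().unwrap_or("?").to_string();
                log.lock().unwrap().push((call_id, name));
            });
        }
    }
    pool.shutdown();
    let entries = log.lock().unwrap().clone();
    entries
}

fn main() {
    for (call_id, thread_name) in run_demo(2, 4, 3) {
        println!("call {} ran on {}", call_id, thread_name);
    }
}
```

The point of the sketch is only the pinning: since a call's work always lands on the same thread, adding threads adds queues rather than adding contention on a shared one.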
@hicqu hey! I am curious whether the reduction comes down to contention that may exist within h2?
As of now, I don't think we have any plans for a completion-queue type of h2 client/server, but that would be interesting. I do think some improvements are needed in h2 as well.
cc @seanmonstar @carllerche @hawkw