Hyper unexpectedly way slower when using HTTP/2
Version
hyper 0.14.23
h2 0.3.15
tokio 1.22.0
Platform
Linux polonium 6.0.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 16 Nov 2022 17:01:17 +0000 x86_64 GNU/Linux
Description
Given the following code:
use hyper::{Client, Server, Body, Response, Request};
use hyper::service::{make_service_fn, service_fn};
use std::env::args;

#[tokio::main]
async fn main() {
    let mut args = args();
    args.next();
    // Pass "http2" as the first argument to force the client into HTTP/2-only mode.
    let http2_only = args.next() == Some("http2".into());

    // Server: answers every request with an empty body.
    tokio::spawn(async {
        let make_svc = make_service_fn(move |_| async move {
            Ok::<_, hyper::Error>(service_fn(move |_req| async move {
                Ok::<_, hyper::Error>(Response::new(Body::from(vec![])))
            }))
        });
        let addr = ([127, 0, 0, 1], 3000).into();
        Server::bind(&addr).serve(make_svc).await
    });

    // Client: 100,000 sequential POSTs with a 3-byte body.
    tokio::spawn(async move {
        let client = Client::builder()
            .http2_only(http2_only)
            .build_http();
        for _ in 0..100_000 {
            let request = Request::builder()
                .method("POST")
                .uri("http://127.0.0.1:3000/foo")
                .body(Body::from(vec![1, 2, 3]))
                .unwrap();
            let _res = client.request(request).await.unwrap();
        }
    }).await.unwrap();
}
I'm getting the following:
[eijebong@polonium] ~/code/hyper-slow (main) >>> time ./target/release/hyper-slow
./target/release/hyper-slow 0.93s user 2.55s system 99% cpu 3.477 total
[eijebong@polonium] ~/code/hyper-slow (main) >>> time ./target/release/hyper-slow http2
./target/release/hyper-slow http2 6.34s user 5.61s system 18% cpu 1:04.53 total
As you can see, when the client is set to HTTP/2 only, the whole run is around 20 times slower (3.5s vs. 64.5s wall time).
Now, if I remove the tokio::spawn around the client async block, I get `./target/release/hyper-slow http2 3.90s user 4.77s system 111% cpu 7.804 total`, which is still about twice as slow as the HTTP/1 run. See the sketch below for what I mean.
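That is, the tail of main() becomes the same client loop from above, just awaited directly instead of through a JoinHandle:

```rust
    // Instead of: tokio::spawn(async move { /* client loop */ }).await.unwrap();
    // run the client loop directly on the main task:
    let client = Client::builder()
        .http2_only(http2_only)
        .build_http();
    for _ in 0..100_000 {
        let request = Request::builder()
            .method("POST")
            .uri("http://127.0.0.1:3000/foo")
            .body(Body::from(vec![1, 2, 3]))
            .unwrap();
        let _res = client.request(request).await.unwrap();
    }
```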
Interesting. We do know HTTP/2 can be slower, depending on the workload. There's room for optimization. But when I run the exact code you pasted, I only see about a 2x slowdown (2s vs. 4s).
I also see latencies that are 2-3x higher when using HTTP/2 instead of HTTP/1.
hyper: 0.14.23
tokio: 1.23.0
os: Linux Mint 20.2

only http2: real 0.075s, user 0.023s, sys 0.022s
both: real 0.010s, user 0.004s, sys 0.007s
We have a benchmark suite comparing HTTP/1 and HTTP/2 end-to-end in hyper in benches/end_to_end.rs, with a variety of concurrency levels and body sizes. It does show (and has for a while) that HTTP/2 can be slower. (They're disabled in master, since they need to be updated to the new split-per-version connection types, but the 0.14.x branch still works well.)
The hope is that work done in h2 could be easily compared by checking those benchmarks.
If someone wanted to set up some profiling, and/or produce a flamegraph of hyper with HTTP/2, that might be a good start to identifying low-hanging fruit for improving performance.
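For anyone picking this up, a rough starting point (assuming a 0.14.x checkout, and cargo-flamegraph installed via `cargo install flamegraph`, which uses perf on Linux):

```sh
# Run the end-to-end HTTP/1 vs HTTP/2 benchmarks on the 0.14.x branch;
# the benches require the "full" feature.
cargo bench --features full --bench end_to_end

# Profile the repro program from this issue under HTTP/2 and emit a flamegraph.
cargo flamegraph -- http2
```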
Sorry to add noise here, but I'm getting the same slowness on HTTP/2, and it's quite confusing. These are repeated requests to a default hyper server on localhost:7609 that just returns a plaintext Body with 10 bytes, using the same hyper::Client, and with TCP_NODELAY set.
http/1.1 time: 363.904µs
http/1.1 time: 228.934µs
http/1.1 time: 172.873µs
http/1.1 time: 196.49µs
http/1.1 time: 163.087µs
http/1.1 time: 140.841µs
http/1.1 time: 132.62µs
http/1.1 time: 141.813µs
http/1.1 time: 148.813µs
http/1.1 time: 169.31µs
http/2 time: 304.376µs
http/2 time: 41.595561ms
http/2 time: 43.943476ms
http/2 time: 44.008445ms
http/2 time: 43.993467ms
http/2 time: 289.768µs
http/2 time: 43.716004ms
http/2 time: 43.98801ms
http/2 time: 280.019µs
http/2 time: 43.706013ms
Most of the time a request takes <300µs, but the HTTP/2 server/client combo (with http2_only(true)) unexpectedly takes ~44ms on some requests, and I'm not sure why.
I can reproduce this on multiple AWS EC2 Ubuntu 20.04 instances. But I can't reproduce it on my ARM-based macOS laptop, nor can I reproduce it on Docker Desktop for macOS.
I'm not sure how to create a flamegraph, but I'll do my best to look into it more.
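For reference, the TCP_NODELAY setup mentioned above looks roughly like this in hyper 0.14 (a sketch, not my exact code):

```rust
use hyper::{Body, Client, Request, Response, Server};
use hyper::client::HttpConnector;
use hyper::service::{make_service_fn, service_fn};

#[tokio::main]
async fn main() {
    // Server: enable TCP_NODELAY on accepted connections and return 10 bytes.
    tokio::spawn(async {
        let make_svc = make_service_fn(|_| async {
            Ok::<_, hyper::Error>(service_fn(|_req| async {
                Ok::<_, hyper::Error>(Response::new(Body::from("0123456789")))
            }))
        });
        let addr = ([127, 0, 0, 1], 7609).into();
        Server::bind(&addr).tcp_nodelay(true).serve(make_svc).await
    });
    // Give the server a moment to bind before the first request.
    tokio::time::sleep(std::time::Duration::from_millis(100)).await;

    // Client: set TCP_NODELAY on the connector before building the client.
    let mut connector = HttpConnector::new();
    connector.set_nodelay(true);
    let client: Client<HttpConnector> = Client::builder()
        .http2_only(true)
        .build(connector);

    let req = Request::builder()
        .uri("http://127.0.0.1:7609/")
        .body(Body::empty())
        .unwrap();
    let _res = client.request(req).await.unwrap();
}
```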
I've observed a similar performance problem: even under low load, the h2 connection has tail latencies of >40ms, while switching to h1 brings tail latencies under 1ms. Since this seems to be related to the internals of the h2 crate, is there any guidance you could provide on how to debug it? I wouldn't mind trying to root-cause this in the h2 crate.
This was on hyper 1.0.1. I've measured timings in different parts of the hyper stack, and the latency comes entirely from reading the request body via the collect() method on the body.
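Concretely, the measurement was of roughly this shape (a sketch, assuming hyper 1.x with http-body-util; the handler name is illustrative):

```rust
use http_body_util::BodyExt; // hyper 1.x: collect() comes from http-body-util
use hyper::body::Incoming;
use hyper::Request;
use std::time::Instant;

// Hypothetical handler: times only the body read, which is where the
// extra HTTP/2 latency showed up.
async fn time_body_read(req: Request<Incoming>) -> hyper::Result<()> {
    let start = Instant::now();
    let bytes = req.into_body().collect().await?.to_bytes();
    eprintln!("collect() took {:?} for {} bytes", start.elapsed(), bytes.len());
    Ok(())
}
```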
We are also observing a roughly 2x increase in latencies since enabling ALPN, which allows negotiating h2. The load is quite low, but the latencies are consistently higher. I can confirm this in an artificial test as well: roughly a 2x increase.
I wonder if that can be mitigated somehow; we'd like to use h2 at some point anyway to reduce the number of connections. Is any work happening on this? Can hyper 1.x help?
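For context, "enabling ALPN" here means advertising h2 in the TLS handshake. With rustls that's roughly (a sketch; our real config has more in it):

```rust
use std::sync::Arc;
use rustls::ServerConfig;

// Sketch: advertising "h2" first lets clients negotiate HTTP/2 via ALPN;
// listing only "http/1.1" keeps everything on HTTP/1.
fn with_h2_alpn(mut config: ServerConfig) -> Arc<ServerConfig> {
    config.alpn_protocols = vec![b"h2".to_vec(), b"http/1.1".to_vec()];
    Arc::new(config)
}
```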
> I can reproduce this on multiple AWS EC2 Ubuntu 20.04 instances. But I can't reproduce it on my ARM-based macOS laptop, nor can I reproduce it on Docker Desktop for macOS.
That's weird. I was able to reproduce the issue on my ARM-based MacBook Pro with M1 Pro. I see an increase from ~2.5s to ~3.5s when benchmarking with the code in the issue description.