Poor performance when stress testing
What is the current bug behavior?
I have written a very small "hello world" HTTP server using Hyper. When I stress test it with hurl (500 parallel jobs, 1 million requests), it averages around 3,000 req/s, which is incredibly slow. But when I use oha, I get 120k req/s.
Steps to reproduce
cat <<EOF | hurl --test --repeat 1000000 --jobs 500
GET http://localhost:8080
EOF
What is the expected correct behavior?
I'm not sure what to expect. Is hurl even the right tool for stress testing?
Execution context
- OS: Ubuntu 25.04
- Hurl Version (hurl --version):
hurl 6.1.1 (unknown) libcurl/8.12.1-DEV OpenSSL/3.0.13 zlib/1.3
Features (libcurl): alt-svc AsynchDNS HSTS IPv6 libz SSL UnixSockets
Features (built-in): brotli
Possible fixes
Absolutely no idea.
Hi @linkdd
We haven't specifically worked on performance, so I'm not really surprised that Hurl has "poor" performance for this usage. That said, the Hurl parallel runner has a different model from oha (Hurl uses multiple threads in one process, while oha seems to use async I/O), so that is a difference.
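As a rough illustration of the difference (a hypothetical sketch, not Hurl's or oha's actual code): a thread-pool runner dedicates one OS thread per worker, each executing its files sequentially with blocking I/O, so throughput is bounded by the number of threads, whereas an async runner multiplexes many in-flight requests on a few threads.

```rust
use std::thread;

// Hypothetical sketch of a thread-pool runner: a fixed set of OS
// threads, each running its share of files one at a time with
// blocking I/O (as Hurl does, with one libcurl easy handle per file).
fn main() {
    let jobs: usize = 4; // equivalent of --jobs
    let files: usize = 12;
    let handles: Vec<_> = (0..jobs)
        .map(|worker| {
            thread::spawn(move || {
                for file in (worker..files).step_by(jobs) {
                    // Placeholder for "run one Hurl file" (a blocking
                    // HTTP request in the real runner).
                    println!("worker {worker}: file {file}");
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```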
Something else that can have an impact is the terminal output: with --test, Hurl displays a lot of progress information in the terminal, so it may be a blocking point. There is no --no-progress-bar option, and without --test you won't get the number of requests/s sent.
We can keep this issue open and see how we can improve things. Would you be kind enough to share:
- the hyper server sample code
- the Hurl command line
- the oha command line
We'll use it as a base to improve things.
Thanks!
Sample Code
https://gist.github.com/linkdd/851f142bc9d6ceb18fb4e1480fdee336
Hurl test case
cat <<EOF | hurl --test --parallel --repeat 1000000 --jobs 500
GET http://localhost:8080
EOF
==>
--------------------------------------------------------------------------------
Executed files: 1000000
Executed requests: 1000000 (3061.4/s)
Succeeded files: 1000000 (100.0%)
Failed files: 0 (0.0%)
Duration: 326653 ms
Oha test case
oha -n 1000000 -c 500 http://localhost:8080
==>
Summary:
Success rate: 100.00%
Total: 8.3379 secs
Slowest: 0.0613 secs
Fastest: 0.0000 secs
Average: 0.0042 secs
Requests/sec: 119934.3614
Total data: 10.49 MiB
Size/request: 11 B
Size/sec: 1.26 MiB
NB: There is no urgency on my side. I still use hurl (love it btw) for the test suite, and I found oha for stress testing.
I just thought it was worth reporting, but I'm not even sure that Hurl aims to be a stress-testing tool.
Thanks a lot for feedback, we'll try to improve things.
Hi @linkdd, just revisiting this issue and giving it a fresh look.
The way Hurl is coded, there is one libcurl handle per Hurl file (roughly one HTTP connection per file). Launching 1,000,000 Hurl files with only one request each is not exactly what we've designed Hurl for. On a MBP M1, launching 1,000,000 files in parallel against the hyper sample can even trigger some TCP ephemeral port exhaustion: each file opens and closes its own connection, closed sockets linger in TIME_WAIT, and the finite pool of ephemeral ports runs out under that pressure.
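(The port pressure is easy to see directly. A tiny demonstration, assuming the hyper sample is listening on port 8080; this is illustrative code, not part of Hurl:)

```rust
use std::net::TcpStream;

// Open connections in a tight loop: each one consumes a local
// ephemeral port, and closed sockets linger in TIME_WAIT, so this
// eventually fails with an "address not available" style error once
// the ephemeral port range is exhausted.
fn main() {
    let mut opened: u64 = 0;
    loop {
        match TcpStream::connect("127.0.0.1:8080") {
            Ok(_stream) => opened += 1, // dropped here -> TIME_WAIT
            Err(err) => {
                eprintln!("failed after {opened} connections: {err}");
                break;
            }
        }
    }
}
```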
To be more "fair" and give a good comparison against oha, I've created a Hurl file with 10,000 requests and run it 10 times in parallel:
Hurl file:
GET http://localhost:8080
[Options]
repeat: 10000
Command:
$ hurl --jobs 10 --repeat 10 --test test_10000.hurl
Success test_10000.hurl (10000 request(s) in 3105 ms)
Success test_10000.hurl (10000 request(s) in 3146 ms)
Success test_10000.hurl (10000 request(s) in 3140 ms)
Success test_10000.hurl (10000 request(s) in 3150 ms)
Success test_10000.hurl (10000 request(s) in 3161 ms)
Success test_10000.hurl (10000 request(s) in 3145 ms)
Success test_10000.hurl (10000 request(s) in 3151 ms)
Success test_10000.hurl (10000 request(s) in 3155 ms)
Success test_10000.hurl (10000 request(s) in 3162 ms)
Success test_10000.hurl (10000 request(s) in 3174 ms)
--------------------------------------------------------------------------------
Executed files: 10
Executed requests: 100000 (30376.7/s)
Succeeded files: 10 (100.0%)
Failed files: 0 (0.0%)
Duration: 3292 ms
(--jobs 10 corresponds to the number of parallel workers; there is no need to use more)
With oha on the same machine:
$ oha -n 100000 http://localhost:8080
Summary:
Success rate: 100.00%
Total: 1702.8641 ms
Slowest: 8.3753 ms
Fastest: 0.2058 ms
Average: 0.8488 ms
Requests/sec: 58724.5929
Total data: 1.05 MiB
Size/request: 11 B
Size/sec: 630.83 KiB
That's ~30,000 req/s for Hurl and ~60,000 req/s for oha.
Given that Hurl has not been designed as a pure stress-testing tool, it's not that bad. That said, it would be cool if you could run this test on your end to get more numbers; no matter the result, we'll try to improve Hurl's performance.
When running Hurl with samply in this configuration:
Hurl file:
GET http://localhost:8080
[Options]
repeat: 10000
$ samply record hurl --test --repeat 10 --jobs 10 test_10000.hurl
We've got this flame graph:
20% of the whole run is spent just parsing the URL. I've made a small modification to handle the happy path (URLs that start with http:// or https://), and now we're around 52,000 req/s (oha is around 60,000 req/s on the same machine).
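For illustration, the kind of fast path involved might look like this (a hypothetical sketch, not Hurl's actual code; prepare_url and the fallback behavior are made up for the example):

```rust
/// Hypothetical URL "happy path": URLs that already start with an
/// explicit http:// or https:// scheme skip the costly generic
/// parsing and are passed through as-is.
fn prepare_url(raw: &str) -> String {
    if raw.starts_with("http://") || raw.starts_with("https://") {
        // Fast path: already an absolute HTTP(S) URL.
        return raw.to_string();
    }
    // Slow path placeholder: a real implementation would run the
    // full parser here (scheme inference, normalization, etc.).
    format!("http://{raw}")
}

fn main() {
    assert_eq!(prepare_url("http://localhost:8080"), "http://localhost:8080");
    assert_eq!(prepare_url("localhost:8080"), "http://localhost:8080");
}
```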
$ hurl --test --repeat 10 --jobs 10 test_10000.hurl
Success test_10000.hurl (10000 request(s) in 1889 ms)
Success test_10000.hurl (10000 request(s) in 1890 ms)
Success test_10000.hurl (10000 request(s) in 1890 ms)
Success test_10000.hurl (10000 request(s) in 1891 ms)
Success test_10000.hurl (10000 request(s) in 1892 ms)
Success test_10000.hurl (10000 request(s) in 1892 ms)
Success test_10000.hurl (10000 request(s) in 1893 ms)
Success test_10000.hurl (10000 request(s) in 1893 ms)
Success test_10000.hurl (10000 request(s) in 1894 ms)
Success test_10000.hurl (10000 request(s) in 1895 ms)
--------------------------------------------------------------------------------
Executed files: 10
Executed requests: 100000 (52687.0/s)
Succeeded files: 10 (100.0%)
Failed files: 0 (0.0%)
Duration: 1898 ms
The flame graph is now:
We can see that the time is now mainly spent in libcurl's HTTP code. To go further, we would maybe need to use a curl multi handle, but we don't want to add too much complexity to the current code.
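For reference, a minimal sketch of the multi-handle idea using the Rust curl crate (an illustration under the assumption that libcurl stays, not a planned implementation): one multi handle drives many concurrent transfers from a single thread, instead of one blocking easy handle per worker.

```rust
use std::time::Duration;

use curl::easy::Easy;
use curl::multi::Multi;

// Drive several concurrent transfers from one thread with a single
// libcurl multi handle.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let multi = Multi::new();
    let mut handles = Vec::new();

    for _ in 0..10 {
        let mut easy = Easy::new();
        easy.url("http://localhost:8080")?;
        easy.write_function(|data| Ok(data.len()))?; // discard the body
        handles.push(multi.add(easy)?);
    }

    // Poll the multi handle until every transfer has completed.
    while multi.perform()? > 0 {
        multi.wait(&mut [], Duration::from_secs(1))?;
    }

    for handle in handles {
        multi.remove(handle)?;
    }
    Ok(())
}
```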
Server (code of the gist):
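(The gist is linked above; as a self-contained reference, here is a minimal hyper 1.x "hello world" sketch of the same shape, assuming tokio, hyper-util, and http-body-util as dependencies. The actual gist may differ.)

```rust
use std::convert::Infallible;
use std::net::SocketAddr;

use http_body_util::Full;
use hyper::body::{Bytes, Incoming};
use hyper::server::conn::http1;
use hyper::service::service_fn;
use hyper::{Request, Response};
use hyper_util::rt::TokioIo;
use tokio::net::TcpListener;

// Answer every request with a fixed 11-byte body.
async fn hello(_: Request<Incoming>) -> Result<Response<Full<Bytes>>, Infallible> {
    Ok(Response::new(Full::new(Bytes::from("hello world"))))
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let addr = SocketAddr::from(([127, 0, 0, 1], 8080));
    let listener = TcpListener::bind(addr).await?;
    loop {
        let (stream, _) = listener.accept().await?;
        let io = TokioIo::new(stream);
        // One task per connection; HTTP/1.1 keep-alive is on by default.
        tokio::task::spawn(async move {
            if let Err(err) = http1::Builder::new()
                .serve_connection(io, service_fn(hello))
                .await
            {
                eprintln!("error serving connection: {err:?}");
            }
        });
    }
}
```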
Thanks for the issue @linkdd, we've made small improvements to the stress-test use case. We'll try to do better in the future.
@jcamiel Thanks! I can confirm the performance gain, that's awesome :) Thank you for the great work.