Hyperfoil
Pipelining discrepancies compared with wrk
When running with pipelining, I see big throughput differences between wrk and a YAML benchmark that replicates wrk's setup.
With wrk:
/home/g/opt/wrk/wrk --latency -d 15 -c 256 --timeout 8 -t 3 http://192.168.1.163:8080/hello -s ../pipeline.lua -- 16
Running 15s test @ http://192.168.1.163:8080/hello
3 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 28.47ms 15.99ms 127.90ms 64.22%
Req/Sec 27.17k 1.55k 35.68k 77.33%
Latency Distribution
50% 27.40ms
75% 39.53ms
90% 50.57ms
99% 64.21ms
1216848 requests in 15.05s, 96.32MB read
Requests/sec: 80877.09
Transfer/sec: 6.40MB
With a homegrown YAML based on the example in the documentation:
[hyperfoil@in-vm]$ run plaintext-wrk -PSERVER=192.168.1.163 -PDURATION=15 -PTHREADS=3 -PTIMEOUT=8 -PPIPELINE=16 -PCONNECTIONS=256
Started run 0003
Run 0003, benchmark plaintext-wrk
Agents: in-vm[STOPPED]
Started: 2023/12/04 10:38:55.427 Terminated: 2023/12/04 10:39:16.441
NAME STATUS STARTED REMAINING COMPLETED TOTAL DURATION DESCRIPTION
calibration TERMINATED 10:38:55.427 10:39:01.434 6007 ms (exceeded by 7 ms) 256 users always
test TERMINATED 10:39:01.434 10:39:16.441 15007 ms (exceeded by 7 ms) 256 users always
[hyperfoil@in-vm]$ stats
Total stats from run 0003
PHASE METRIC THROUGHPUT REQUESTS MEAN p50 p90 p99 p99.9 p99.99 TIMEOUTS ERRORS BLOCKED 2xx 3xx 4xx 5xx CACHE
calibration request 46.00k req/s 276326 5.51 ms 4.88 ms 11.21 ms 13.43 ms 16.25 ms 21.63 ms 0 0 0 ns 276326 0 0 0 0
test request 46.09k req/s 691686 5.50 ms 4.85 ms 11.21 ms 13.30 ms 15.53 ms 19.79 ms 0 0 0 ns 691686 0 0 0 0
The YAML looks like this:
name: plaintext-wrk
threads: !param THREADS 2 # option -t
http:
  host: !concat [ "http://", !param SERVER localhost, ":8080" ]
  allowHttp2: false
  pipeliningLimit: !param PIPELINE 1
  sharedConnections: !param CONNECTIONS 10 # option -c
ergonomics:
  repeatCookies: false
  userAgentFromSession: false
phases:
- calibration:
    always:
      users: !param CONNECTIONS 10 # option -c
      duration: 6s
      maxDuration: 70s # This is duration + default timeout 60s
      scenario: &scenario
      - request:
        - httpRequest:
            GET: /hello
            timeout: !param TIMEOUT 60s # option --timeout
            headers:
            - accept: text/plain # option -H
            handler:
              # We'll check that the response was successful (status 200-299)
              status:
                range: 2xx
- test:
    always:
      users: !param CONNECTIONS 10 # option -c
      duration: !param DURATION 10s # option -d
      maxDuration: 70s # This is duration + default timeout 60s
      startAfterStrict: calibration
      scenario: *scenario
I can't compare with Hyperfoil's built-in wrk emulation because it doesn't support pipelining.
Without pipelining, the difference is no longer there. The wrk results are:
/home/g/opt/wrk/wrk --latency -d 15 -c 256 --timeout 8 -t 3 http://192.168.1.163:8080/hello
Running 15s test @ http://192.168.1.163:8080/hello
3 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 7.14ms 6.15ms 37.64ms 78.02%
Req/Sec 14.40k 1.02k 17.89k 62.00%
Latency Distribution
50% 2.92ms
75% 10.93ms
90% 17.58ms
99% 23.17ms
644537 requests in 15.04s, 51.02MB read
Requests/sec: 42850.63
Transfer/sec: 3.39MB
And with the YAML above:
[hyperfoil@in-vm]$ run plaintext-wrk -PSERVER=192.168.1.163 -PDURATION=15 -PTHREADS=3 -PTIMEOUT=8 -PPIPELINE=1 -PCONNECTIONS=256
Started run 0002
Run 0002, benchmark plaintext-wrk
Agents: in-vm[STOPPED]
Started: 2023/12/04 10:36:37.261 Terminated: 2023/12/04 10:36:58.277
NAME STATUS STARTED REMAINING COMPLETED TOTAL DURATION DESCRIPTION
calibration TERMINATED 10:36:37.261 10:36:43.269 6008 ms (exceeded by 8 ms) 256 users always
test TERMINATED 10:36:43.269 10:36:58.277 15008 ms (exceeded by 8 ms) 256 users always
[hyperfoil@in-vm]$ stats
Total stats from run 0002
PHASE METRIC THROUGHPUT REQUESTS MEAN p50 p90 p99 p99.9 p99.99 TIMEOUTS ERRORS BLOCKED 2xx 3xx 4xx 5xx CACHE
calibration request 39.90k req/s 239699 6.20 ms 2.82 ms 18.74 ms 24.12 ms 28.05 ms 30.80 ms 0 0 0 ns 239699 0 0 0 0
test request 39.88k req/s 598583 6.39 ms 2.83 ms 19.66 ms 23.86 ms 27.26 ms 32.51 ms 0 0 0 ns 598583 0 0 0 0
While this is not a bug per se, it is due to how Hyperfoil implements pipelining: it is indeed able to send more than one HTTP request over the same connection without waiting for their completion, but it sends them one by one without batching, i.e. https://github.com/Hyperfoil/Hyperfoil/blob/80c2bfc2f483ee2baf53ea26cbbe830305310462/http/src/main/java/io/hyperfoil/http/connection/Http1xConnection.java#L194-L199
Instead, it should perform the writes without flushing and flush only when a full batch has been written. The problem is that this is not simple to achieve.
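Conceptually, the difference looks like the following minimal, Netty-flavoured sketch (illustrative only, not Hyperfoil's actual code; the class and method names are made up):

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import java.util.List;

class PipelineSketch {
    // Current behaviour: every pipelined request is written AND flushed
    // individually, so each request can cost a separate flush/syscall.
    void sendOneByOne(ChannelHandlerContext ctx, List<ByteBuf> requests) {
        for (ByteBuf request : requests) {
            ctx.writeAndFlush(request);
        }
    }

    // Proposed behaviour: queue the writes in the channel's outbound buffer
    // and flush once per batch, letting the whole pipeline leave in as few
    // TCP segments as possible.
    void sendBatched(ChannelHandlerContext ctx, List<ByteBuf> requests) {
        for (ByteBuf request : requests) {
            ctx.write(request);
        }
        ctx.flush();
    }
}

As far as I know, wrk's stock pipeline.lua gets the batched variant for free: it concatenates all N requests into a single buffer, so each batch goes out in one write.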
For example, suppose we implement a watermark-based mechanism on the connection that knows when the maximum pipelining level has been reached. If there are not enough available sessions to issue the "last" request of a batch, we lose the chance to flush all the previous ones in the batch. And we cannot make progress either, because no new sessions will become available until the responses to those unflushed requests are received; this leads to a starvation problem due to a circular dependency.
Even more naively: if the rate is 1 request/s, we cannot just wait for a fixed watermark to be reached before flushing a batch, or we risk waiting N (= pipelining level) seconds before the whole batch is flushed.
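For the record, one hypothetical direction (just a sketch of the idea, with made-up names, not an agreed design) is a Nagle-style policy: flush as soon as the watermark is reached, but also arm a bounded-delay fallback flush so a partial batch never waits forever. The fallback also breaks the circular dependency above, since flushing a partial batch lets responses come back and free up sessions:

import io.netty.channel.ChannelHandlerContext;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical watermark-plus-timer flusher; every name here is illustrative.
class BatchingFlusher {
    private final ChannelHandlerContext ctx;
    private final int pipeliningLimit;        // watermark: flush when reached
    private final long maxDelayNanos;         // upper bound on batching delay
    private int pending;                      // requests written but not flushed
    private ScheduledFuture<?> fallbackFlush; // timer covering partial batches

    BatchingFlusher(ChannelHandlerContext ctx, int pipeliningLimit, long maxDelayNanos) {
        this.ctx = ctx;
        this.pipeliningLimit = pipeliningLimit;
        this.maxDelayNanos = maxDelayNanos;
    }

    // Call on the event loop after each ctx.write(request).
    void onRequestWritten() {
        pending++;
        if (pending >= pipeliningLimit) {
            flushNow(); // batch complete: flush immediately
        } else if (fallbackFlush == null) {
            // First request of a partial batch: bound the wait so we never
            // depend on sessions that can only appear after a flush.
            fallbackFlush = ctx.executor().schedule(
                    this::flushNow, maxDelayNanos, TimeUnit.NANOSECONDS);
        }
    }

    private void flushNow() {
        if (fallbackFlush != null) {
            fallbackFlush.cancel(false);
            fallbackFlush = null;
        }
        if (pending > 0) {
            pending = 0;
            ctx.flush();
        }
    }
}

With maxDelayNanos in the sub-millisecond range, the 1 request/s case above degrades gracefully: a lone request is flushed after at most that delay instead of after N seconds.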
Doing this right requires some further investigation, and TBH I'm not sure it is worth our time unless it's a trivial change, mostly because HTTP/1.1 pipelining is not really a thing outside of benchmarks.
BUT... if HTTP/2 behaves the same way in Hyperfoil, and by measuring it we observe that it needs to be fixed, we can use this chance to unify the approaches and fix this one too.