Latency under load
The benchmark in https://github.com/mirage/ocaml-cohttp/issues/328 seems unrealistic to me, since the request handler responds immediately. In practice, a handler usually performs some database requests to build the response. So I have investigated what happens when a Lwt.yield is added inside the request handler.
The code is here: https://gist.github.com/vouillon/5002fd0a8c33eb0634fb08de6741cec0
I'm running the benchmark with the following command. Compared to https://github.com/mirage/ocaml-cohttp/issues/328, I had to raise the request rate significantly to overwhelm the web servers:
wrk2 -t8 -c10000 -d60S --timeout 2000 -R 600000 --latency -H 'Connection: keep-alive' http://localhost:8080/
Cohttp is significantly slower than http/af, as expected. But http/af seems to exhibit some queueing as well, with a median latency of almost 10 seconds.
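As a rough sanity check on that number, here is a back-of-the-envelope model. It is only a sketch under assumptions that I have not verified for this run: wrk2 measures latency from the intended send time of each request, and the server sustains a constant throughput $\mu$ below the requested rate $R$ for the whole duration $T$. A request scheduled at time $t$ is then roughly the $(Rt)$-th request and completes around time $Rt/\mu$, so:

$$
L(t) \approx \frac{R\,t}{\mu} - t = t\left(\frac{R}{\mu} - 1\right),
\qquad
L_{\text{median}} \approx \frac{T}{2}\left(\frac{R}{\mu} - 1\right)
$$

With $T = 60\,\text{s}$ and $L_{\text{median}} \approx 10\,\text{s}$, this gives $R/\mu \approx 4/3$. For $R = 600000$ requests per second, that would correspond to a sustained throughput of about $450000$ requests per second, if the model holds.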

So, I'm wondering whether I'm doing anything wrong. Or maybe this is just something one should expect, since there is no longer any backpressure to limit the number of concurrent requests being processed?
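To illustrate what I mean by the lack of backpressure, here is a minimal sketch. It is not the benchmark code from the gist: there is no HTTP server involved, the `handler` and `run` functions are hypothetical, and I use `Lwt.pause ()` as a stand-in for the yield. It only counts how many handlers are in flight at the same time.

```ocaml
(* Requires the lwt and lwt.unix libraries. *)

let in_flight = ref 0
let peak = ref 0

(* Hypothetical handler: optionally yields to the scheduler once
   before producing its response. *)
let handler ~yield () =
  incr in_flight;
  if !in_flight > !peak then peak := !in_flight;
  let open Lwt.Infix in
  (if yield then Lwt.pause () else Lwt.return_unit) >>= fun () ->
  decr in_flight;
  Lwt.return "ok"

(* Start [n] handlers back to back, wait for all of them, and
   return the largest number that were in flight simultaneously. *)
let run ~yield n =
  in_flight := 0;
  peak := 0;
  let requests =
    List.init n (fun _ -> Lwt.map ignore (handler ~yield ()))
  in
  Lwt_main.run (Lwt.join requests);
  !peak

let () =
  Printf.printf "peak without yield: %d\n" (run ~yield:false 1000);
  Printf.printf "peak with yield:    %d\n" (run ~yield:true 1000)
```

Without the pause, each handler runs to completion before the next one starts, so the peak should be 1. With the pause, all 1000 handlers are started before any of them finishes, so the peak should be 1000. Nothing limits how many requests are accepted into processing.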