e2e-benchmarking
Use maxConnection=-1 in router-perf test to increase TPS and reduce connection errors
@rsevilla87 Hi Raul, in this bug comment https://bugzilla.redhat.com/show_bug.cgi?id=1983751#c7, I verified that configuring maxConnection=-1 greatly reduces the number of '0' responses, increases the number of '200' responses, and also increases TPS and reduces latency. More test data and charts can be found here: https://docs.google.com/spreadsheets/d/1jNYCdTu2XvSs4xARk8PwQGoPZVgra0jOQORIlUKAdKg/edit#gid=1789221797
Please let me know what you think about adding this configuration to the router-perf test as an ENV var and making it the default.
Please note that maxConnection=-1 will cause the router pod to consume more CPU and memory.
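If we go down this path, here is a hypothetical sketch of what the ENV var toggle could look like (the variable name, default behaviour, and placement in the router-perf scripts are assumptions on my side, not the current code):

```bash
# Hypothetical toggle: leave ROUTER_MAX_CONNECTIONS unset to keep the cluster
# default; set it (e.g. to 80000, as in the oc set env example later in this
# thread) to override the router's maxconn before the test runs.
if [[ -n "${ROUTER_MAX_CONNECTIONS:-}" ]]; then
  oc -n openshift-ingress set env deployment/router-default \
    ROUTER_MAX_CONNECTIONS="${ROUTER_MAX_CONNECTIONS}"
fi
```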
In my test results above, I used:
INFRA_NODE_INSTANCE_TYPE=m5.12xlarge (48 vCPU x 192 GiB)
WORKLOAD_NODE_INSTANCE_TYPE=m5.8xlarge (32 vCPU x 128 GiB)
I saw that https://github.com/cloud-bulldozer/airflow-kubernetes/pull/190 tried to shift the infrastructure nodes from 48x192 to 16x64; that may need to be re-evaluated if this configuration is used.
/cc @sjug
Hi @qiliRedHat, we tested the new haproxy auto maxConn functionality with the network edge team, and the decision was made not to set it as the default for the reasons you've mentioned. It's not a config that we'd normally want to test on a regular basis. There should be/is a note added to the router docs for customers that want the extra throughput at the cost of higher resource consumption.
Here we have two things:
- Identify the number of clients (LARGE_SCALE_CLIENTS) according to TERMINATIONS
- From the results.csv file, use only 200-status requests for the latency calculation

We should reduce the number of router connections in our e2e mb config file to 20K (using 500 routes * 40 clients) for http and passthrough, and to 10K (using 500 routes * 20 clients) for edge and re-encrypt.
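A rough sketch of how the first item could be wired (LARGE_SCALE_CLIENTS and TERMINATIONS come from the list above; the TERMINATION variable and the exact placement in the scripts are assumptions):

```bash
# Pick the client count per termination so the total number of mb connections
# stays around 20K on the haproxy side (500 routes assumed).
case "${TERMINATION}" in
  http|passthrough) LARGE_SCALE_CLIENTS=40 ;;
  edge|reencrypt)   LARGE_SCALE_CLIENTS=20 ;;
esac
```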
In the comment https://bugzilla.redhat.com/show_bug.cgi?id=1983751#c7 we can see a higher number of non-200 responses than 200 responses with 80 clients. Unfortunately, we are counting these non-200 response times in the latency calculation.
In our local testing, we also tuned the haproxy "maxconn" by adding
oc set env -n openshift-ingress deployment router-default ROUTER_MAX_CONNECTIONS=80000
in our e2e script in configure_ingress_images() at https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/router-perf-v2/common.sh#L47
The table below shows our results with 20K, 80K, and 120K haproxy "maxconn" on ROSA and self-managed AWS clusters. We used 500 routes, 80 clients, "edge" termination, and 50 keep-alive requests, with a 60-second mb duration.
Env | ROUTER_MAX_CONNECTIONS (haproxy "maxconn") | rps | 99pctl latency (sec) | Status 0 requests (K) | Status 200 requests (K) |
---|---|---|---|---|---|
ROSA | 20K(default) | 954 | 1659554399 | 210 | 57 |
ROSA | 20K(default) | 942 | 1659555059 | 188 | 56 |
ROSA | 80K | 20970 | 24 | 7 | 1258 |
ROSA | 80K | 14980 | 23 | 7 | 898 |
ROSA | 120K | 14901 | 23 | 4.8 | 894 |
ROSA | 120K | 1119 | 1659603597 | 196 | 67 |
self-managed | 20K(default) | 420 | 18 | 376 | 25 |
self-managed | 20K(default) | 1980 | 11 | 409 | 118 |
With 80K haproxy "maxconn", we reached 1258K successful "200" status requests, and the total was 1265K requests (i.e. 1258K + 7K). Requests per second in this case was 20970.
With 20K haproxy "maxconn", the total was only about 525K requests (409K + 118K). The number of non-200 status requests (409K) was about 3.5 times higher than that of 200 status requests (118K).
As we are going with the default haproxy "maxconn" (i.e. 20K), we should reduce the number of router connections in our e2e mb config file, to get a higher number of 200-status requests, to:
- 20K (using 500 routes * 40 clients) for http and passthrough
- 10K (using 500 routes * 20 clients) for edge and re-encrypt, as these termination modes use 2x more connections (see the quick check below)
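A quick arithmetic check of that sizing:

```bash
# mb connections = routes * clients; edge/re-encrypt roughly double this on haproxy.
echo "http/passthrough: $((500 * 40)) connections"                        # 20000
echo "edge/re-encrypt:  $((500 * 20)) connections (~$((500 * 20 * 2)) on haproxy)"
```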
Non-200 status failures happen at two stages:
- before the connection is established, i.e. "socket_write_request()" errors
- after the connection is established, i.e. "socket_read(): connection" errors

For example, from the result file:
1659603596237517,6787302,0,0,0,GET https://http-perf-edge-376-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,4,33909,9,0,1659603596237517,6786128,0,0,socket_write_request()
1659603587079435,11071037,200,157,1383,GET https://http-perf-edge-128-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,0,21074,2,3,1659603552561460,10998685,14395053,0,
0,1659603603024829,0,0,0,GET https://http-perf-edge-128-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,0,21074,2,3,1659603552561460,10998685,14395053,0,socket_read(): connection
1659603593482628,4668064,0,157,0,GET https://http-perf-edge-186-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,1,22858,1,4,1659603547015225,5681502,13701044,0,socket_read(): connection
So including the time taken by these non-200 status requests in the latency calculation is not ideal, as the errors happen at different stages. Our results.csv parser in e2e should be enhanced to use only 200-status requests for the latency calculation.
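A quick shell approximation of what the enhanced parser should compute, assuming the results.csv layout of the rows quoted above (column 2 = latency in microseconds, column 3 = status code):

```bash
# p99 latency computed over 200-status requests only.
awk -F, '$3 == 200 {print $2}' results.csv | sort -n | awk '
  {lat[NR] = $1}
  END {
    if (NR == 0) { print "no 200 responses"; exit }
    idx = int(NR * 0.99); if (idx < 1) idx = 1
    printf "p99 latency (200 only): %.3f s over %d requests\n", lat[idx] / 1e6, NR
  }'
```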
@venkataanil Thanks for sharing the non-200 status failure stages.
I also noticed the way latency is calculated. The latency result is not accurate, especially when the number of non-200 responses is relatively high. I think the ideal way could be to calculate latency separately for each type of 'result_codes'. https://github.com/cloud-bulldozer/e2e-benchmarking/blob/f3991ff2595f1571cc49f54598ee48c108da68b2/workloads/router-perf-v2/workload.py#L62-L71
And TPS (rps) is calculated as the number of 200 response codes divided by the runtime, which means that if the number of non-200 responses is relatively big, the TPS (rps) will be smaller. That's another reason to avoid non-200 response codes. https://github.com/cloud-bulldozer/e2e-benchmarking/blob/f3991ff2595f1571cc49f54598ee48c108da68b2/workloads/router-perf-v2/workload.py#L105
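For illustration, a quick shell check of the per-code request counts and the resulting rps (again assuming column 3 of results.csv is the result code, and a 60-second runtime as in the table above):

```bash
# Requests and rps per result code; the current parser effectively reports only
# the 200 bucket divided by the runtime.
RUNTIME=60
awk -F, -v t="$RUNTIME" '
  {count[$3]++}
  END {
    for (code in count)
      printf "status %s: %d requests, %.0f rps\n", code, count[code], count[code] / t
  }
' results.csv
```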
> Hi @qiliRedHat, we tested the new haproxy auto maxConn functionality with the network edge team, and the decision was made not to set it as the default for the reasons you've mentioned. It's not a config that we'd normally want to test on a regular basis. There should be/is a note added to the router docs for customers that want the extra throughput at the cost of higher resource consumption.
Thanks for sharing this, I will pay attention to the doc. For now, I only see in the 4.11 doc that there is a new maxConnections configuration in tuningOptions: https://docs.openshift.com/container-platform/4.11/networking/ingress-operator.html#nw-ingress-controller-configuration-parameters_configuring-ingress
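For reference, a hedged example of using that 4.11 field (names per the doc linked above; per that doc, -1 lets haproxy compute the maximum connections automatically):

```bash
# Set spec.tuningOptions.maxConnections on the default IngressController.
oc -n openshift-ingress-operator patch ingresscontroller/default \
  --type=merge -p '{"spec":{"tuningOptions":{"maxConnections":-1}}}'
```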
@qiliRedHat thanks for reporting, the latency problem has been patched already - https://github.com/cloud-bulldozer/e2e-benchmarking/pull/453