e2e-benchmarking
Use maxConnection=-1 in router-perf test to increase TPS and reduce connection errors
@rsevilla87 Hi Raul, in this bug comment https://bugzilla.redhat.com/show_bug.cgi?id=1983751#c7, I verified that configuring maxConnection=-1 greatly reduces the number of '0' responses, increases the number of '200' responses, and also increases TPS and reduces latency. More test data and charts can be found here: https://docs.google.com/spreadsheets/d/1jNYCdTu2XvSs4xARk8PwQGoPZVgra0jOQORIlUKAdKg/edit#gid=1789221797
Please let me know what you think about adding this configuration to the router-perf test as an ENV var and making it the default.
Please note that maxConnection=-1 will cause the router pod to consume more CPU and memory.
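If we go down this path, here is a hypothetical sketch of what the ENV var toggle could look like (the variable name, default behaviour, and placement in the router-perf scripts are assumptions on my side, not the current code):

```bash
# Hypothetical toggle: leave ROUTER_MAX_CONNECTIONS unset to keep the cluster
# default; set it (e.g. to 80000, as in the oc set env example later in this
# thread) to override the router's maxconn before the test runs.
if [[ -n "${ROUTER_MAX_CONNECTIONS:-}" ]]; then
  oc -n openshift-ingress set env deployment/router-default \
    ROUTER_MAX_CONNECTIONS="${ROUTER_MAX_CONNECTIONS}"
fi
```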
In my test results above, I used:
INFRA_NODE_INSTANCE_TYPE=m5.12xlarge (48 vCPU x 192 GiB)
WORKLOAD_NODE_INSTANCE_TYPE=m5.8xlarge (32 vCPU x 128 GiB)
I saw that https://github.com/cloud-bulldozer/airflow-kubernetes/pull/190 tried to shift the infrastructure nodes from 48x192 to 16x64; that may need to be re-evaluated if this configuration is used.
/cc @sjug
Hi @qiliRedHat, we tested the new haproxy auto maxConn functionality with the network edge team, and the decision was made not to set it as the default for the reasons you've mentioned. It's not a config that we'd normally want to test on a regular basis. There should be/is a note added to the router docs for customers that want the extra throughput at the cost of higher resource consumption.
Here we have two things:
- Identify the number of clients (LARGE_SCALE_CLIENTS) according to TERMINATIONS
- From the results.csv file, use only 200-status requests for the latency calculation

We should reduce the number of router connections in our e2e mb config file to 20K (using 500 routes * 40 clients) for http and passthrough, and to 10K (using 500 routes * 20 clients) for edge and re-encrypt.
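A rough sketch of how the first item could be wired (LARGE_SCALE_CLIENTS and TERMINATIONS come from the list above; the TERMINATION variable and the exact placement in the scripts are assumptions):

```bash
# Pick the client count per termination so the total number of mb connections
# stays around 20K on the haproxy side (500 routes assumed).
case "${TERMINATION}" in
  http|passthrough) LARGE_SCALE_CLIENTS=40 ;;
  edge|reencrypt)   LARGE_SCALE_CLIENTS=20 ;;
esac
```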
In the comment https://bugzilla.redhat.com/show_bug.cgi?id=1983751#c7 we can see a higher number of non-200 responses than 200 responses with 80 clients. Unfortunately, we are counting these non-200 response times in the latency calculation.
In our local testing, we also tuned the haproxy "maxconn" by adding
oc set env -n openshift-ingress deployment router-default ROUTER_MAX_CONNECTIONS=80000
in our e2e script in configure_ingress_images() at https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/router-perf-v2/common.sh#L47
The table below shows our results with 20K, 80K, and 120K haproxy "maxconn" on ROSA and self-managed AWS clusters. We used 500 routes, 80 clients, "edge" termination, and 50 keep-alive requests, with a 60-second mb duration.
Env | ROUTER_MAX_CONNECTIONS (haproxy "maxconn") | rps | 99pctl latency (sec) | Status 0 requests (K) | Status 200 requests (K) |
---|---|---|---|---|---|
ROSA | 20K(default) | 954 | 1659554399 | 210 | 57 |
ROSA | 20K(default) | 942 | 1659555059 | 188 | 56 |
ROSA | 80K | 20970 | 24 | 7 | 1258 |
ROSA | 80K | 14980 | 23 | 7 | 898 |
ROSA | 120K | 14901 | 23 | 4.8 | 894 |
ROSA | 120K | 1119 | 1659603597 | 196 | 67 |
self-managed | 20K(default) | 420 | 18 | 376 | 25 |
self-managed | 20K(default) | 1980 | 11 | 409 | 118 |
With 80K haproxy "maxconn", we reached 1258K successful "200" status requests, and the total was 1265K requests (i.e. 1258K + 7K). Requests per second in this case was 20970.
With 20K haproxy "maxconn", the total was only about 525K requests (409K + 118K). The number of non-200 status requests (409K) was about 3.5 times higher than that of 200 status requests (118K).
As we are going with the default haproxy "maxconn" (i.e. 20K), we should reduce the number of router connections in our e2e mb config file, to get a higher number of 200-status requests, to:
- 20K (using 500 routes * 40 clients) for http and passthrough
- 10K (using 500 routes * 20 clients) for edge and re-encrypt, as these termination modes use 2x more connections (see the quick check below)
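A quick arithmetic check of that sizing:

```bash
# mb connections = routes * clients; edge/re-encrypt roughly double this on haproxy.
echo "http/passthrough: $((500 * 40)) connections"                        # 20000
echo "edge/re-encrypt:  $((500 * 20)) connections (~$((500 * 20 * 2)) on haproxy)"
```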
Non-200 status failures happen at two stages:
- before the connection is established, i.e. "socket_write_request()" errors
- after the connection is established, i.e. "socket_read(): connection" errors

For example, from the result file:
1659603596237517,6787302,0,0,0,GET https://http-perf-edge-376-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,4,33909,9,0,1659603596237517,6786128,0,0,socket_write_request()
1659603587079435,11071037,200,157,1383,GET https://http-perf-edge-128-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,0,21074,2,3,1659603552561460,10998685,14395053,0,
0,1659603603024829,0,0,0,GET https://http-perf-edge-128-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,0,21074,2,3,1659603552561460,10998685,14395053,0,socket_read(): connection
1659603593482628,4668064,0,157,0,GET https://http-perf-edge-186-http-scale-edge.apps.perf-411-aa3d.i9nd.s1.devshift.org:443/1024.html,1,22858,1,4,1659603547015225,5681502,13701044,0,socket_read(): connection
So including the time taken by these non-200 status requests in the latency calculation is not ideal, as the errors happen at different stages. Our results.csv parser in e2e should be enhanced to use only 200-status requests for the latency calculation.
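A quick shell approximation of what the enhanced parser should compute, assuming the results.csv layout of the rows quoted above (column 2 = latency in microseconds, column 3 = status code):

```bash
# p99 latency computed over 200-status requests only.
awk -F, '$3 == 200 {print $2}' results.csv | sort -n | awk '
  {lat[NR] = $1}
  END {
    if (NR == 0) { print "no 200 responses"; exit }
    idx = int(NR * 0.99); if (idx < 1) idx = 1
    printf "p99 latency (200 only): %.3f s over %d requests\n", lat[idx] / 1e6, NR
  }'
```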
@venkataanil Thanks for sharing the non-200 status failure stages.
I also noticed the way latency is calculated. The latency result is not accurate, especially when the number of non-200 responses is relatively high. I think the ideal way could be to calculate latency separately for each type of 'result_codes'. https://github.com/cloud-bulldozer/e2e-benchmarking/blob/f3991ff2595f1571cc49f54598ee48c108da68b2/workloads/router-perf-v2/workload.py#L62-L71
And TPS (rps) is calculated as the number of 200 response codes divided by the runtime, which means that if the number of non-200 responses is relatively big, the TPS (rps) will be smaller. That's another reason to avoid non-200 response codes. https://github.com/cloud-bulldozer/e2e-benchmarking/blob/f3991ff2595f1571cc49f54598ee48c108da68b2/workloads/router-perf-v2/workload.py#L105
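For illustration, a quick shell check of the per-code request counts and the resulting rps (again assuming column 3 of results.csv is the result code, and a 60-second runtime as in the table above):

```bash
# Requests and rps per result code; the current parser effectively reports only
# the 200 bucket divided by the runtime.
RUNTIME=60
awk -F, -v t="$RUNTIME" '
  {count[$3]++}
  END {
    for (code in count)
      printf "status %s: %d requests, %.0f rps\n", code, count[code], count[code] / t
  }
' results.csv
```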
> Hi @qiliRedHat, we tested the new haproxy auto maxConn functionality with the network edge team, and the decision was made not to set it as the default for the reasons you've mentioned. It's not a config that we'd normally want to test on a regular basis. There should be/is a note added to the router docs for customers that want the extra throughput at the cost of higher resource consumption.
Thanks for sharing this, I will pay attention to the doc. For now, I only see in the 4.11 doc that there is a new maxConnections configuration in tuningOptions: https://docs.openshift.com/container-platform/4.11/networking/ingress-operator.html#nw-ingress-controller-configuration-parameters_configuring-ingress
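For reference, a hedged example of using that 4.11 field (names per the doc linked above; per that doc, -1 lets haproxy compute the maximum connections automatically):

```bash
# Set spec.tuningOptions.maxConnections on the default IngressController.
oc -n openshift-ingress-operator patch ingresscontroller/default \
  --type=merge -p '{"spec":{"tuningOptions":{"maxConnections":-1}}}'
```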
@qiliRedHat thanks for reporting, the latency problem has been patched already - https://github.com/cloud-bulldozer/e2e-benchmarking/pull/453