e2e-benchmarking
new workload router-perf-v3
We run the router-perf-v2
workload for data-plane performance testing, and it needs some enhancements to address different routing behavior on managed-service OCP.
This workload creates pods and generates traffic within the cluster (from a hostNetwork
pod). On AWS OpenShift, HTTP traffic from the client gets routed to an external LoadBalancer
VIP and routed back into the cluster.
Other platforms follow a completely different approach: GCP/Azure route client traffic to the corresponding service ClusterIP
using iptables DNAT rules, so the client traffic never exits the cluster on its way to the server.
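To make the GCP/Azure path concrete, this is roughly the shape of the DNAT rules kube-proxy installs (a hedged illustration only: the chain names follow kube-proxy's KUBE-SERVICES/KUBE-SVC convention, but the service, ClusterIP, and pod IP here are made up):

```shell
# Traffic sent to the service ClusterIP (172.30.10.5:80, hypothetical)
# is matched in the KUBE-SERVICES chain of the nat table...
iptables -t nat -A KUBE-SERVICES -d 172.30.10.5/32 -p tcp --dport 80 \
  -j KUBE-SVC-EXAMPLE

# ...and DNAT'ed directly to a backend pod IP, so the packet is
# rewritten on the client node and never leaves the cluster network.
iptables -t nat -A KUBE-SVC-EXAMPLE -p tcp \
  -j DNAT --to-destination 10.128.2.15:8080
```

Because the rewrite happens locally, the AWS path (out to an external LB VIP and back) and the GCP/Azure path exercise very different data planes, which is why results are hard to compare across platforms.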
It is probably worth spending some effort on a new workload, router-perf-v3
(or an add-on to v2), that runs the client from an external source to replicate a real-world scenario. The consistency of results would be affected by known external variables (LB, client resources, client location, cloud variability), but this way the workload follows the same behavior on all platforms and results are easier to compare between them.
Recently, we have been noticing inconsistency in the results when running router-perf-v2 on ROSA/AWS. It is high time to redesign this or add a new HTTP benchmark tool: slack convo
The inconsistency in latency is due to an issue in the mb tool's reporting logic: it records a wrong value when the response is non-200 with a socket_read(): connection
error. Latency is calculated as the delta between the request and response timestamps, but in this case the response timestamp is wrong (0), and the bogus samples skew the P99, P90, and average latency.
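The failure mode described above can be sketched in a few lines (a minimal illustration assumed from this issue's description, not the actual mb source; the function names and timestamps are hypothetical):

```python
def latency_ms(request_ts, response_ts):
    # Buggy behavior: trusts response_ts even when the request failed,
    # so a failed request with response_ts == 0 yields a huge negative
    # "latency" of -request_ts.
    return response_ts - request_ts

def latency_ms_fixed(request_ts, response_ts):
    # Guarded version: a zero response timestamp means the request never
    # completed (e.g. socket_read(): connection error), so the sample
    # should be dropped instead of being fed into the percentile stats.
    if response_ts == 0:
        return None
    return response_ts - request_ts

request_ts = 1000.0
# Four normal responses, a few ms after the request.
good = [latency_ms(request_ts, request_ts + d) for d in (5, 7, 9, 11)]
# One failed request reported with response_ts = 0.
bad = latency_ms(request_ts, 0)          # -1000.0: clearly bogus

avg_buggy = sum(good + [bad]) / 5        # dragged far from the real value
avg_fixed = sum(good) / 4                # 8.0 ms, the true average
```

A single bad sample like this is enough to distort the average, and at high error rates it dominates P90/P99 as well, which matches the inconsistency seen on ROSA/AWS.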