[Feature] SMP: Distributed Load Testing Capability with Nighthawk

Open leecalcote opened this issue 1 year ago • 1 comment

Prologue

Many performance benchmarks are limited to single-instance load generation (a single-pod load generator). This caps the traffic that can be generated at the output of the one machine the benchmark tool runs on, whether inside or outside a cluster. Overcoming this limitation would allow for more flexible and robust testing. Running load generators in parallel does pose a challenge, however: results must be merged without losing the precision needed to gain insight into the high tail percentiles. Distributed load testing also offers insight into system behaviors that arguably represent real-world services under load more accurately, because real-world load comes from any number of sources.
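To make the merging concern concrete, here is a minimal sketch of why per-worker percentiles cannot simply be averaged. It assumes each load generator reports a latency histogram keyed by bucket upper bound in milliseconds; that reporting format, the function names, and the sample numbers are all hypothetical and are not taken from Nighthawk's results sink design.

```python
from collections import Counter


def merge_histograms(histograms):
    """Sum per-bucket counts across workers (keys are bucket upper bounds in ms)."""
    merged = Counter()
    for histogram in histograms:
        merged.update(histogram)  # Counter.update adds counts, it does not overwrite
    return dict(merged)


def percentile(histogram, p):
    """Return the smallest bucket upper bound that covers at least p percent of samples."""
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("empty histogram")
    threshold = total * p / 100.0
    running = 0
    for bound in sorted(histogram):
        running += histogram[bound]
        if running >= threshold:
            return bound
    return max(histogram)


# Hypothetical results from two load generators of very different volume.
worker_a = {1: 900, 5: 90, 50: 10}   # 1000 requests, mostly fast
worker_b = {1: 50, 5: 30, 500: 20}   # 100 requests, with a slow tail

merged = merge_histograms([worker_a, worker_b])
p99_merged = percentile(merged, 99)
p99_averaged = (percentile(worker_a, 99) + percentile(worker_b, 99)) / 2

print(p99_merged)    # 500
print(p99_averaged)  # 252.5
```

Averaging the two per-worker p99 values hides the slow tail seen by the second worker, while merging bucket counts first preserves it. The precision of the merged percentile is still bounded by the bucket widths, so whichever format the results sink uses needs fine enough buckets in the tail.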

Current Behavior

The goal of an adaptive load controller is to determine the maximum load a system can sustain.

The maximum load is usually defined as the maximum number of requests per second (RPS) the system can handle. The metrics collected from the system under test (SUT), such as CPU usage and latency, are the constraints we provide to judge whether the SUT is sustaining the load.
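As a rough illustration of the idea, the sketch below searches for the highest RPS at which every collected metric stays within its constraint. It is not Meshery's or Nighthawk's controller: the `run_step` callback, the metric names, and the fake SUT are hypothetical, and the binary search assumes that metrics only get worse as load increases.

```python
def find_max_sustainable_rps(run_step, constraints, low, high, tolerance=10):
    """Binary search for the highest RPS whose metrics satisfy all constraints.

    run_step(rps) is assumed to run one load step and return a dict of metrics.
    constraints maps a metric name to its maximum allowed value.
    Returns None if no tested load level satisfied the constraints.
    """
    best = None
    while high - low > tolerance:
        mid = (low + high) // 2
        metrics = run_step(mid)
        if all(metrics[name] <= limit for name, limit in constraints.items()):
            best = mid   # sustained: try a higher load
            low = mid
        else:
            high = mid   # constraint violated: back off
    return best


def fake_sut(rps):
    """Hypothetical stand-in for a real load step against a system under test."""
    return {
        "p99_latency_ms": 20 + max(0, rps - 1000) * 0.5,
        "cpu_percent": rps / 20,
    }


constraints = {"p99_latency_ms": 100, "cpu_percent": 80}
max_rps = find_max_sustainable_rps(fake_sut, constraints, low=0, high=2000)
print(max_rps)
```

With the fake SUT, latency is the binding constraint (it crosses 100 ms just above 1160 RPS), so the search settles within `tolerance` of that value. A real controller would also need to handle noisy measurements, for example by repeating each step before deciding.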

A use case that fits very well is running performance tests on a schedule and tracking the maximum load a system can handle over time. This could give insight into performance improvements or degradations.

Desired Behavior

  1. Add the ability to horizontally scale Nighthawk and use Nighthawk's proposed execution forwarding service and results sink. See design specification.
  2. Integration of Nighthawk with relevant cloud native performance testing scenarios. Capability to invoke Nighthawk for distributed load testing through APIs and command-line interfaces. Implementation of reporting mechanisms for distributed load testing results.

Prerequisites

  1. Nighthawk as an Externalized Meshery Component

Future Goals

  1. Add adaptive load control capability to Meshery server.
  2. Make this functionality extensible through plugins and extensions.

Acceptance Tests

  1. Incorporation of distributed load analysis into the ongoing tests and performance reports.

Mockups


Contributor Guides and Resources

leecalcote Jan 29 '24 21:01

Hi @leecalcote 👋, I hope you are doing well!

Over the past few days, I have been diving deep into the project, going through the getnighthawk repo, the meshery-nighthawk repo, and the Nighthawk repo to understand their design architectures, components, and lifecycles.

I intend to contribute to this project during Term 1 of LFX, and I am currently working on its proposal.

I have a decent knowledge of building microservices with Go. Are there any other areas I should focus on learning to add more value to this project? Any prerequisite tasks or tips from your side would be very helpful.

Thank you for your time! 😊

Aankirz Feb 05 '24 06:02