nighthawk icon indicating copy to clipboard operation
nighthawk copied to clipboard

Consider generalizing the adaptive load "testing stage"

Open eric846 opened this issue 5 years ago • 1 comments

Today we have to deal with the adjusting stage and testing stage having separate durations, and the fact that the traffic template is a full CommandLineOptions with a duration field that is ignored.

One basic idea I had was to make StepController::GetLatestCommandLineOptions() take a duration input, so the CommandLineOptions is completely ready to send. But in some sense that just pushes the weird treatment of duration to another place.

I'm not sure that the idea of a dedicated testing stage really needs to be baked into the framework. During the design review we talked about exploring the input space, rather than just searching for a single target RPS. After such an exploration, the adaptive load framework would force you to do a testing stage you aren't interested in, and you would discard the result.

More generic would be to have the StepController decide everything, with duration being no different from anything else in CommandLineOptions. Some StepControllers, like the basic one, could have a "search and then test" behavior where they run one longer benchmark at the end, but it would be the StepController deciding it. The main loop could just do whatever the StepController wants until it says it's time to exit, then not have any testing stage.

Concretely, this would mean:

  • Deleting the durations from the AdaptiveLoadSessionSpec proto
  • Adding duration settings to each StepController
  • Deleting the testing stage code that follows the main loop
  • Deleting the testing stage from the output proto

Duration is certainly an InputVariableSetter plugin we could trivially support, as suggested in https://github.com/envoyproxy/nighthawk/pull/535#pullrequestreview-488483737

eric846 avatar Sep 15 '20 20:09 eric846

Thanks for filing this and sharing your thoughts. My comments regarding the special treatment of duration had been more based on a mixture of gut feeling and in-review partial context, but your analysis puts it in full context which I think should help assess what would be for/against following up on eliminating that special treatment. I suspect that following up on this might also position this well for https://github.com/envoyproxy/nighthawk/issues/470

oschaaf avatar Sep 15 '20 21:09 oschaaf