
New Test Type - Performance Benchmarks

RobertBouillon opened this issue 1 year ago • 1 comment

Summary

Add performance benchmarks as a type of test which measures changes in performance.

Background and Motivation

Pass/Fail is an inadequate way to measure the quality of a system. Some quality requirements, such as API response time, must be met and maintained for every software release. Even where strict quality requirements don't exist, it's good to know whether a change has significantly impacted the performance of a feature, positively or negatively.

  • It's common for application performance to degrade over time, because we lack the tools to adequately measure the performance of software.
  • Properly benchmarking software requires specialized tools and knowledge.
  • Performance problems may not be identified immediately, requiring forensic analysis of source control history to determine the cause. We lack ways to "fail fast" with quality-focused tests.

Proposed Feature

  • A new test type which focuses on measuring performance (see the sketch after this list)
    • A tool to create a "baseline" against which to compare changes.
    • Baselines should be specific to an environment & version (e.g. a tag or commit)
    • Support for different environments (local, dev, staging, qa)
    • Historical tracking - important for tracking performance creep that doesn't trigger alarms release-to-release.
    • Reporting
    • Measurements (benchmark results) which can be tracked in source control
    • Configurable benchmark-focused optimizations, such as warmup iterations
    • Resource tracking (memory, CPU usage, I/O)
    • Configurable thresholds for warnings & failures (static error %, or based on std. deviation)
    • Micro-Environment (VM) configuration (mono, server GC, workstation GC)
    • Macro-Environment (OS Host) configuration (linux, windows, remote deployments)
    • Configurable concurrency
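
To make the proposal concrete, here is a minimal sketch of what such a test could look like if the warmup, baseline, and threshold pieces were hand-rolled on top of today's MSTest APIs. Everything benchmark-specific is an assumption invented for illustration (the baseline file path and JSON shape, the iteration counts, the 20% threshold, the `WorkUnderTest` placeholder); none of it is an existing MSTest feature.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text.Json;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class ParserPerformanceTests
{
    // Hypothetical baseline file, checked into source control alongside the tests.
    private const string BaselinePath = "benchmarks/parser.baseline.json";
    private const int WarmupIterations = 5;       // assumed value
    private const int MeasuredIterations = 50;    // assumed value
    private const double FailThresholdPercent = 20.0; // assumed: fail if >20% slower than baseline

    public record Baseline(double MeanMs);

    [TestMethod]
    public void Parse_LargeDocument_StaysWithinBaseline()
    {
        // Warmup: let the JIT and caches settle before measuring.
        for (int i = 0; i < WarmupIterations; i++)
            WorkUnderTest();

        // Measure average wall-clock time across the remaining iterations.
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < MeasuredIterations; i++)
            WorkUnderTest();
        sw.Stop();
        double meanMs = sw.Elapsed.TotalMilliseconds / MeasuredIterations;

        if (File.Exists(BaselinePath))
        {
            // Compare against the stored baseline using a static percentage threshold.
            var baseline = JsonSerializer.Deserialize<Baseline>(File.ReadAllText(BaselinePath))!;
            double regressionPercent = (meanMs - baseline.MeanMs) / baseline.MeanMs * 100.0;
            Assert.IsTrue(
                regressionPercent <= FailThresholdPercent,
                $"Regression of {regressionPercent:F1}% exceeds the {FailThresholdPercent}% threshold " +
                $"(baseline {baseline.MeanMs:F2} ms, current {meanMs:F2} ms).");
        }
        else
        {
            // No baseline yet: record one so future runs have something to compare against.
            Directory.CreateDirectory(Path.GetDirectoryName(BaselinePath)!);
            File.WriteAllText(BaselinePath, JsonSerializer.Serialize(new Baseline(meanMs)));
        }
    }

    private static void WorkUnderTest()
    {
        // Placeholder for the code being benchmarked.
        Span<int> buffer = stackalloc int[1024];
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = i * i;
    }
}
```

Checking the generated JSON file into source control is what the "measurements tracked in source control" item above is after; per-environment baselines could be keyed by machine name or a configuration variable.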

Alternative Designs

  • BenchmarkDotNet has implemented a lot of the heavy lifting for benchmarking; however, it lacks a proper framework for automation. MSTest seems like the best way to bring this into the standard workflow (a bridging sketch follows this list).
  • Most profiling tools live exclusively in the CI/CD pipeline. As with unit tests, performance tests in the pipeline should serve as a backstop guarantee; validation should already have been performed by a developer first.
  • Most CI/CD tools measure performance at load / under stress. Those are complex tests. Simple tests (essentially unit tests, but for performance) would be a more pragmatic way of tracking performance changes over time.
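
As a stopgap, BenchmarkDotNet can already be driven from an ordinary MSTest test, which hints at what the integration could feel like. A hedged sketch, assuming BenchmarkDotNet's in-process toolchain (by default BenchmarkDotNet spawns a separate Release-built process, which doesn't fit a test host) and an arbitrary 50 µs budget:

```csharp
using System.Linq;
using System.Security.Cryptography;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Toolchains.InProcess.Emit;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Illustrative benchmark class; the workload is arbitrary.
public class HashingBenchmarks
{
    private readonly byte[] _data = new byte[10_000];

    [Benchmark]
    public byte[] Sha256() => SHA256.HashData(_data);
}

[TestClass]
public class PerformanceSmokeTests
{
    [TestMethod]
    public void Sha256_MeanStaysUnderBudget()
    {
        // Run in-process so the benchmark works inside the test host, and
        // disable the optimizations validator since test projects often build Debug.
        var config = ManualConfig.Create(DefaultConfig.Instance)
            .AddJob(Job.ShortRun.WithToolchain(InProcessEmitToolchain.Instance))
            .WithOptions(ConfigOptions.DisableOptimizationsValidator);

        var summary = BenchmarkRunner.Run<HashingBenchmarks>(config);
        Assert.IsFalse(summary.HasCriticalValidationErrors);

        // ResultStatistics.Mean is reported in nanoseconds.
        double meanNs = summary.Reports.Single().ResultStatistics!.Mean;
        Assert.IsTrue(meanNs < 50_000, $"Mean {meanNs:F0} ns exceeded the 50 µs budget (assumed threshold).");
    }
}
```

This gets the measurement machinery for free but none of the baseline tracking, reporting, or environment configuration from the list above, which is the gap a first-class test type would fill.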

RobertBouillon • Feb 14 '24 18:02

Thanks for the suggestion. We're discussing it; we also feel there's a gap here, and we're experimenting with a solution along those lines: https://github.com/microsoft/testfx/blob/main/test/Performance/MSTest.Performance.Runner/Program.cs#L43

We will report back when we have a clearer plan.

cc: @pavelhorak @Evangelink

MarcoRossignoli • Mar 17 '24 16:03