Tracking success rate of benchmarked functions
I have a use case for tracking the performance and success rate of non-deterministic functions.
The following function serves to outline the scenario:
```python
import random
import time


def foo():
    # base_time and error_rate are parameters of the scenario under test
    time.sleep(base_time + abs(random.gauss(0, 0.01)))
    if random.random() < error_rate:
        raise RuntimeError
```
I have played around and arrived at the following helper:
```python
from functools import wraps


def benchmark_pedantic_with_count(benchmark, function, *args, **kwargs):
    successes = []

    @wraps(function)
    def wrapper(*args, **kwargs):
        # Swallow exceptions so the benchmark keeps running, and record
        # whether each call succeeded.
        try:
            result = function(*args, **kwargs)
            successes.append(True)
            return result
        except Exception:
            successes.append(False)

    benchmark.pedantic(wrapper, *args, **kwargs)
    benchmark.extra_info['success_count'] = sum(successes)

    # Expose the success rate as an extra 'succ' stats field so it can
    # travel alongside the built-in statistics.
    new_stats_fields = list(benchmark.stats.stats.fields)
    new_stats_fields.append('succ')
    benchmark.stats.stats.fields = new_stats_fields
    benchmark.stats.stats.succ = sum(successes) / len(successes)
```
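For context, a quick sketch of how the helper is used from a test (the `rounds`/`iterations` values are just illustrative and are forwarded to `benchmark.pedantic`):

```python
def test_foo(benchmark):
    # `benchmark` is the pytest-benchmark fixture; extra keyword arguments
    # are passed through to benchmark.pedantic.
    benchmark_pedantic_with_count(benchmark, foo, rounds=20, iterations=1)
```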
To actually get the new `succ` column displayed, I also had to:

- Add `succ` to `pytest_benchmark.utils.ALLOWED_COLUMNS`.
- Overwrite `pytest_benchmark.table.display` so it shows `succ`.

(How exactly to achieve those two things is left as an exercise for the reader; a rough sketch of the first one follows.)
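For the first step, something along these lines in `conftest.py` is roughly what I mean. This is an untested sketch: it assumes `ALLOWED_COLUMNS` is a plain mutable list and that patching it here takes effect before the column option is validated; the `display` override is longer and omitted.

```python
# conftest.py -- rough, untested sketch of the first step only.
# Assumes ALLOWED_COLUMNS is a mutable list and that patching it here
# happens before pytest-benchmark validates --benchmark-columns.
import pytest_benchmark.utils

if "succ" not in pytest_benchmark.utils.ALLOWED_COLUMNS:
    pytest_benchmark.utils.ALLOWED_COLUMNS.append("succ")
```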
While this does work, I am unsure whether my solution could be upstreamed easily.
How should I go about it if I want my solution to be merged into pytest-benchmark?
Alternate and related approaches:

- Add an argument to `benchmark.pedantic` that makes it continue on exceptions and exposes the list of exceptions caught (like `[None, None, RuntimeError, None, RuntimeError]`); a hypothetical sketch follows after this list.
- Add an argument to `benchmark.pedantic` that changes the return type to a list of all results, then set up the benchmarked function so that it catches the relevant exceptions and returns whatever I want.
- Allow `extra_info` keys in the terminal table. This would be great!
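To make the first alternative concrete, here is a purely hypothetical sketch; neither `continue_on_exceptions` nor `exceptions_caught` exists in pytest-benchmark today, they are just the names I would imagine:

```python
def test_foo(benchmark):
    # Hypothetical flag: keep timing rounds even when foo() raises.
    benchmark.pedantic(foo, rounds=5, continue_on_exceptions=True)

    # Hypothetical attribute: the exception caught per round (None for a
    # successful round), e.g. [None, None, RuntimeError, None, RuntimeError].
    caught = benchmark.exceptions_caught
    benchmark.extra_info["success_rate"] = caught.count(None) / len(caught)
```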
Is there currently a way to omit failed runs from the timing statistics? With non-determinism and a recorded success rate, it might be desirable to include only successful runs in the statistics.