pyperformance
Allow calculating geometric mean of groups of benchmarks based on tags
[Moved from https://github.com/faster-cpython/ideas/discussions/395]
It's becoming obvious that:
- The pyperformance suite needs more benchmarks that resemble real-world workloads, and we should lean into optimizing for these and using them to report progress.
- Microbenchmarks of a particular feature are also useful and belong in the benchmark suite, but we shouldn't over-optimize for them or use them as a (misleading) indicator of overall progress.
It seems that one way to address this would be to lean into "tags" more in the pyperformance/pyperf ecosystem. pyperformance already allows for tags in each benchmark's pyproject.toml.
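For reference, a benchmark's tags live under the `[tool.pyperformance]` table of its pyproject.toml. A minimal illustrative sketch (the names and tag value here are made up; the exact schema is whatever pyperformance's manifest code accepts):

```toml
[project]
name = "pyperformance_bm_example"
dependencies = ["pyperf"]
dynamic = ["version"]

[tool.pyperformance]
name = "example"
tags = "serialize"
```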
I propose we:
- Output the tags for each benchmark in the benchmark results in the `metadata` dictionary.
- `pyperf compare_to` would then calculate the geometric mean for each subset of benchmarks for each tag found in the results, as well as "all" benchmarks (existing behavior), as sketched below. This could be behind a flag if backward compatibility matters.
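To make the proposal concrete, here is a minimal sketch of the per-tag geometric mean, not pyperf's actual implementation. It assumes the usual pyperf JSON layout (a top-level `benchmarks` list whose entries carry `metadata` and `runs` with `values`) and a hypothetical `tags` list in each benchmark's metadata, which pyperf does not emit today:

```python
import json
from collections import defaultdict
from statistics import geometric_mean, mean

def load_means(path):
    """Map benchmark name -> (mean timing, tags) from a pyperf JSON file."""
    with open(path) as f:
        data = json.load(f)
    means = {}
    for bench in data["benchmarks"]:
        meta = bench.get("metadata", {})
        values = [v for run in bench["runs"] for v in run.get("values", [])]
        if values:
            # "tags" is the hypothetical metadata entry proposed above.
            means[meta["name"]] = (mean(values), meta.get("tags", []))
    return means

def per_tag_geomean(ref_path, changed_path):
    """Geometric mean of changed/reference timing ratios, per tag."""
    ref = load_means(ref_path)
    changed = load_means(changed_path)
    ratios = defaultdict(list)  # tag -> list of per-benchmark ratios
    for name in ref.keys() & changed.keys():
        ratio = changed[name][0] / ref[name][0]
        # "all" reproduces the existing whole-suite geometric mean.
        for tag in [*ref[name][1], "all"]:
            ratios[tag].append(ratio)
    return {tag: geometric_mean(rs) for tag, rs in ratios.items()}
```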
Alternatives:
We could instead use the nested benchmark hierarchy, rather than tags. Personally, I think tags are easier to understand and more flexible (a benchmark can be associated with multiple tags).
+1
IIRC, my intention with tags was for them to be an extension to benchmark groups in the manifest. However, I ran out of time to take it all the way.
(I also considered doing away with groups and instead having a different manifest per group. However, that ends up rather clunky and less user-friendly.)
Another thing we could do with tags/groups is split up a results file based on them.
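A rough sketch of that splitting, under the same assumptions as above (JSON-level manipulation and a hypothetical `tags` list in each benchmark's metadata):

```python
import json
from collections import defaultdict

def split_by_tag(path):
    """Write one results file per tag found in the suite."""
    with open(path) as f:
        suite = json.load(f)
    by_tag = defaultdict(list)
    for bench in suite["benchmarks"]:
        tags = bench.get("metadata", {}).get("tags", ["untagged"])
        for tag in tags:
            by_tag[tag].append(bench)
    for tag, benches in by_tag.items():
        # Keep the suite-level metadata, swap in the tag's subset.
        with open(f"results-{tag}.json", "w") as f:
            json.dump({**suite, "benchmarks": benches}, f)
```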
Is this done now?