Tracking performance testing - collect and analyze results of benchmark runs
Related to #5070, item 2.5 (Propose a mechanism to track performance over time):
- have a repo inside the kokkos organization where to store all the json files: let's call this the `database-repo` for now
- push there (somehow) all the json files generated by the benchmarks, for example every time a PR is merged into the main branch
- have a script inside `database-repo` that automatically generates plots every time files are updated
- when the Performance Test is run on CI, we can add an additional step which uploads all generated json files to `database-repo`
  - there is a GitHub Action which can be used for this: https://github.com/marketplace/actions/push-a-file-to-another-repository
  - this action only needs an `API_TOKEN_GITHUB` to be set in the `Secrets` section of the repository options (see the sketch below)
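A minimal sketch of what that upload step could look like, assuming the benchmark JSON ends up in a `results/` directory; the destination repository name, file path, and bot identity below are placeholders:

```yaml
# Additional step in the Kokkos CI workflow, run after the benchmarks.
- name: Push benchmark results to the database repo
  uses: dmnemec/copy_file_to_another_repo_action@main
  env:
    # token with write access to the destination repo, stored under
    # Settings -> Secrets of the repository running this workflow
    API_TOKEN_GITHUB: ${{ secrets.API_TOKEN_GITHUB }}
  with:
    source_file: 'results/benchmarks.json'   # placeholder path
    destination_repo: 'kokkos/database-repo' # placeholder name
    destination_folder: 'results'
    user_email: 'ci-bot@example.com'         # placeholder identity
    user_name: 'kokkos-ci-bot'
    commit_message: 'add benchmark results for ${{ github.sha }}'
```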
- after pushing benchmark results to `database-repo`, a GitHub Action will run in `database-repo`; the sole purpose of that action will be to run the analysis software that generates the required output data
  - to analyze the benchmark data we can use: https://github.com/bensanmorris/benchmark_monitor
  - by default this tool generates a chart of the benchmark runs and an HTML index file based on a Jinja2 template; the template could be modified to include benchmark context data
  - the generated files can be pushed to GitHub Pages, for example with this GitHub Action: https://github.com/marketplace/actions/deploy-to-github-pages (see the workflow sketch below)
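Putting the pieces together, the workflow on the `database-repo` side could look roughly like the sketch below. The `benchmark_monitor.py` invocation and the `output` folder are assumptions; the exact command line has to be taken from the tool's README.

```yaml
# Hypothetical .github/workflows/analyze.yml in database-repo
name: analyze benchmark results
on:
  push:
    branches: [main]
jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - name: generate charts and HTML index
        # exact arguments depend on benchmark_monitor's CLI, see its README
        run: |
          pip install -r requirements.txt
          python benchmark_monitor.py
      - name: deploy to GitHub Pages
        uses: JamesIves/github-pages-deploy-action@v4
        with:
          branch: gh-pages # branch the Pages site is served from
          folder: output   # assumed output directory of benchmark_monitor
```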
Example chart: *(image omitted)*

TODO:
- I will create a quick demo in my own repositories
Looks like it should "just work"; thumbs up for using preexisting components!
Thanks, looks like a good solution! Some comments:
- I think we should also find a way to save the plots with some filename convention
- we need to figure out if the proposed solution is OK in terms of permissions etc.; that is up to @crtrott @dalg24

After they give the OK, we can try to build the prototype.
By default the images are saved using the format `BenchmarkName-metric.png`, so for example `BM_SomeFunction-real_time.png`.
If we want a different name format, we will need to modify the `benchmark_monitor.py` script.
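Alternatively, to avoid patching the script, the generated images could be renamed in a workflow step right after the tool runs; a sketch, where the `charts/` directory and the commit-hash prefix are made up for illustration:

```yaml
- name: apply filename convention to generated charts
  # prefixes each chart with the short commit hash, e.g.
  # abc1234-BM_SomeFunction-real_time.png (illustrative scheme)
  run: |
    mkdir -p charts
    for f in *.png; do
      mv "$f" "charts/${GITHUB_SHA::7}-$f"
    done
```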
My biggest concern is that nowadays we are mostly interested in performance of the GPU backends, and the machines we run CI on don't produce good/consistent enough results for performance regression testing.
The GitHub Action on `database-repo` can be configured to create new charts after any commit on a given branch (e.g. main), so it can be triggered by a commit from CI or by a commit made manually.
We can create separate directories to store benchmark results from CI and from dedicated benchmark machines; a trigger sketch follows below.
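For reference, the trigger section of such a workflow with separate result directories (the directory names are placeholders):

```yaml
on:
  push:
    branches: [main]
    # regenerate charts whenever new results land, no matter whether
    # they were committed by CI or pushed manually from a benchmark machine
    paths:
      - 'results/ci/**'
      - 'results/benchmark-machines/**'
```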
Current status:
- gather benchmark results and push them to a dedicated repository:
  - implemented on a performance-results-visualization branch, see the changes here
  - uses a fork of `dmnemec/copy_file_to_another_repo_action` (customized for our needs) to push the json files to the kokkos-benchmark-results repository (see the sketch below)
- process the results:
  - when the files get pushed to kokkos-benchmark-results, they are processed with `benchmark_monitor.py` and the generated content is pushed to a GitHub Pages instance: https://github.com/cz4rs/kokkos-benchmark-results/blob/master/.github/workflows/ci.yml
  - you can see the resulting graphs here: https://cz4rs.github.io/
  - sample benchmark results with git info included
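For completeness, a sketch of the customized push step with the repositories named above; the fork location and file paths are illustrative, the authoritative version lives on the performance-results-visualization branch:

```yaml
- name: push benchmark results
  # stand-in location for the customized fork of
  # dmnemec/copy_file_to_another_repo_action
  uses: cz4rs/copy_file_to_another_repo_action@main
  env:
    API_TOKEN_GITHUB: ${{ secrets.API_TOKEN_GITHUB }}
  with:
    source_file: 'build/benchmark-results'              # placeholder path
    destination_repo: 'kokkos/kokkos-benchmark-results'
    user_email: 'ci-bot@example.com'                    # placeholder identity
    user_name: 'kokkos-ci-bot'
```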
TODO:
- [x] use commit hashes to identify builds in graphs (~~after https://github.com/kokkos/kokkos/pull/5463 gets merged~~ edit: merged recently, work in progress)
- [x] use the metric that is prefixed with "FOM" (figure of merit) automatically
  - potentially: ensure that all the benchmarks contain such a metric on the `kokkos` side
- [x] make it easier to search for specific results
Performance results are collected and stored in https://github.com/kokkos/kokkos-benchmark-results.