celerity-runtime
Microbenchmarks - work in progress
This is a work in progress, open for discussion.
I currently vendored the args library directly into the benchmarks folder, which is probably not what we want. However, I expect these benchmarks to potentially take a lot of arguments, and I don't want to parse them ad hoc. I also saw that we already do exactly that in several of the examples, but it might be a bit tricky to re-use a library across both the examples and the benchmarks, since both are designed to be buildable either independently or as part of the overall tree. I'm open to suggestions on how best to do this.
Catch2 ships with Clara, a small command-line parsing library. Although the standalone Clara project appears to have been discontinued, it is still alive and well as part of the Catch2 distribution. Could we use that instead of args.hxx?
After some offline discussion and trying out the different options, I have now decided to include args as a submodule, similar to spdlog and catch2. I think this is unproblematic, since it's a very small repository, comes with its own CMake file, and is header-only. I disabled the options that build its tests and examples when it is included in the Celerity context.
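For concreteness, the wiring could look roughly like the sketch below. Note that `ARGS_BUILD_EXAMPLE`, `ARGS_BUILD_UNITTESTS`, the `CELERITY_BENCHMARKS` guard, and the `vendor/args` path are assumptions for illustration, not verified against the actual CMake setup in this PR:

```cmake
# Sketch: pull in the args submodule for the benchmarks while disabling
# its own tests and examples. Option names and paths are assumptions;
# verify them against the vendored args CMakeLists before relying on this.
if(CELERITY_BENCHMARKS)
  set(ARGS_BUILD_EXAMPLE OFF CACHE BOOL "" FORCE)
  set(ARGS_BUILD_UNITTESTS OFF CACHE BOOL "" FORCE)
  add_subdirectory(vendor/args)
endif()
```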
More importantly, the tasks microbenchmark now does slightly more interesting work (building a tree and an upside-down tree of tasks), but I am seeing some strange behaviour in the generated dependencies which I still need to look into.
Investigate re-using graph layout generators between the isolated unit-test microbenchmarks and this PR.
In this distributed setting, it would be interesting to benchmark common communication patterns (gather / scatter / all-to-all) as well as scalar reductions from host buffers against the optimized MPI collectives.
We do have some microbenchmarks by now; will revisit this in the future.