nvbench
Add optional global setup/teardown feature to main
Adds an NVBENCH_ENVIRONMENT macro that can be defined as a global, fixture-like class which will be created and destroyed in the nvbench main() runtime. The definition should occur before including nvbench.cuh. For example:
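#include <cstdio>

struct my_env {
  // Runs once inside main(), before any benchmark is executed.
  my_env(int, char const* const*) {
    printf("setup\n");
  }
  // Runs once when main() finishes, after all benchmarks are complete.
  ~my_env() {
    printf("tear down\n");
  }
};
#define NVBENCH_ENVIRONMENT my_env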
This allows creating global CUDA resources (like an RMM memory pool) that can be initialized during the runtime before calling the benchmark functions and then destroyed once the benchmarks are complete. Calling CUDA APIs as part of initializing static global variables is undefined behavior, so any CUDA resources must be created (and destroyed) during the main() runtime.
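The mechanism described above can be sketched in a self-contained form. This is not the nvbench implementation: `DEMO_ENVIRONMENT`, `demo_main`, `run_benchmarks` and `event_log` are hypothetical names used only for illustration, and the empty fallback type merely mirrors the idea of a "no environment" default.

```cpp
#include <string>
#include <vector>

// Records the order of events so it can be inspected afterwards.
inline std::vector<std::string>& event_log() {
  static std::vector<std::string> log;
  return log;
}

// Fallback used when the user defines nothing: an empty, do-nothing type.
struct no_environment {
  no_environment(int, char const* const*) {}
};

// User side: a fixture-like class, selected through a macro.
struct my_env {
  my_env(int, char const* const*) { event_log().push_back("setup"); }
  ~my_env() { event_log().push_back("tear down"); }
};
#define DEMO_ENVIRONMENT my_env

// Library side (hypothetical): pick the default if no macro was defined.
#ifndef DEMO_ENVIRONMENT
#define DEMO_ENVIRONMENT no_environment
#endif

// Stand-in for running all registered benchmarks.
void run_benchmarks() { event_log().push_back("benchmarks"); }

// Stand-in for main(): the environment object lives for the whole run.
int demo_main(int argc, char const* const* argv) {
  [[maybe_unused]] DEMO_ENVIRONMENT env(argc, argv);  // setup happens here
  run_benchmarks();
  return 0;                                           // env is destroyed on return
}
```

Because the environment is an ordinary local object inside the main-like function, its constructor and destructor run at well-defined points of the program rather than during static initialization.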
Defining NVBENCH_ENVIRONMENT is optional, so it will not break any existing implementations. Attaching a fixture-like object to an nvbench class would not work since the benchmark objects are instantiated each time they are used and the state objects can be copied (reinstantiated) many times. Adding a global function/class is not practical, as mentioned above, and having multiple copies of the RMM pool is not possible -- the second instantiation will report out-of-memory.
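To illustrate why a second pool instance fails, here is a toy model that involves neither RMM nor CUDA. `fake_device`, `fake_pool` and `second_pool_fits` are hypothetical names, and the byte counts are arbitrary. The only assumption is that a pool reserves a large share of device memory when it is constructed.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for device memory: one budget shared by all pools.
struct fake_device {
  std::size_t free_bytes;
};

// Hypothetical pool that reserves its memory up front and returns it on destruction.
class fake_pool {
public:
  fake_pool(fake_device& dev, std::size_t bytes) : dev_(dev), bytes_(bytes) {
    if (bytes > dev.free_bytes) { throw std::bad_alloc{}; }  // "out of memory"
    dev.free_bytes -= bytes;
  }
  ~fake_pool() { dev_.free_bytes += bytes_; }

  // A pool owns its reservation, so copying is disabled.
  fake_pool(fake_pool const&)            = delete;
  fake_pool& operator=(fake_pool const&) = delete;

private:
  fake_device& dev_;
  std::size_t bytes_;
};

// Returns true if a second pool can be created while the first is still alive.
bool second_pool_fits(std::size_t device_bytes, std::size_t pool_bytes) {
  fake_device dev{device_bytes};
  fake_pool first(dev, pool_bytes);
  try {
    fake_pool second(dev, pool_bytes);
    return true;
  } catch (std::bad_alloc const&) {
    return false;
  }
}
```

In this model, `second_pool_fits(1000, 800)` is false: the first pool already holds most of the budget, which is the situation a per-benchmark or per-state fixture would run into.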
This solution is helpful for libcudf benchmarks where the RMM memory pool resource must be initialized once at runtime before making any libcudf calls. Setting up the pool on every iteration of a benchmark is slow and expensive and in some cases hampers profile analysis.
Reference issue: https://github.com/NVIDIA/nvbench/issues/78
Thanks @davidwendt. This approach is neat.
Out of curiosity I double checked how GTest does this and it looks like the primary downside is it doesn't allow layering multiple global setup/teardown steps.
It's OK to register multiple environment objects. In this suite, their SetUp() will be called in the order they are registered, and their TearDown() will be called in the reverse order.
I don't have a solid use case for that, but it seems like a worthwhile feature to emulate.
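For reference, the layering behavior quoted from GTest (set up in registration order, tear down in reverse) could be emulated along these lines. This is only a sketch under assumed names: `environment_step`, `environment_stack` and `run_layered_demo` are hypothetical and are not part of nvbench or GTest.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One global setup/teardown layer.
struct environment_step {
  std::function<void()> set_up;
  std::function<void()> tear_down;
};

// Keeps layers in registration order.
class environment_stack {
public:
  void add(environment_step step) { steps_.push_back(std::move(step)); }

  // Set up in the order the layers were registered.
  void set_up_all() {
    for (auto& step : steps_) { step.set_up(); }
  }

  // Tear down in the reverse order.
  void tear_down_all() {
    for (auto it = steps_.rbegin(); it != steps_.rend(); ++it) { it->tear_down(); }
  }

private:
  std::vector<environment_step> steps_;
};

// Registers two layers and returns the order in which everything ran.
std::vector<std::string> run_layered_demo() {
  std::vector<std::string> log;
  environment_stack envs;
  envs.add({[&] { log.push_back("pool up"); }, [&] { log.push_back("pool down"); }});
  envs.add({[&] { log.push_back("logger up"); }, [&] { log.push_back("logger down"); }});

  envs.set_up_all();
  log.push_back("benchmarks");
  envs.tear_down_all();
  return log;
}
```

The reverse teardown order matters when a later layer depends on an earlier one, for example a logger that allocates from the pool.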
The trick here is that while GTest allows you to write your own main(), nvbench does not.
@allisonvacanti may have some clever ideas about how to do this.
the primary downside is it doesn't allow layering multiple global setup/teardown steps.
I would be happy to build this kind of solution in a later PR if it becomes a requirement. We could create a new issue to track it so that it is not forgotten. The current solution could possibly close some open issues: #78, #100, and perhaps #99.
Since nvbench.main is compiled once into an object file, and not per benchmark executable, this define doesn't work. The nvbench.main will always use nvbench::no_environment.
Sorry, I'm not following this. The intention was to insert it into the main() function. As long as it is defined in the object that contains main() we should be ok?
The object that contains main() is the nvbench.main CMake target when that is used by consumers. So it will always have the no_environment version.
The only way to get this to work in a downstream project is to not use nvbench::main but to have one of the test files call the NVBENCH_MAIN macro directly.
I think we should just update the # Global Setup and Tear Down section to state that this feature is only usable when you call NVBENCH_MAIN directly and don't use the nvbench::main target.
Merging in main and starting CI -- @davidwendt @robertmaynard is this version up-to-date with what cudf is using?
/ok to test
Yes, this matches our current patch in cudf. Reference: https://github.com/rapidsai/cudf/blob/branch-24.06/cpp/cmake/thirdparty/patches/nvbench_global_setup.diff
I just got around to reviewing this, sorry about the long delay.
I opened a patch over the weekend to make similar changes to how NVBench's main can be customized by providing a handful of reusable hooks: #165.
It looks like that PR provides equivalent functionality through the NVBENCH_MAIN_INITIALIZE_CUSTOM_POST hook -- creating your environment RAII object here should do the trick. Would that be a suitable solution for your use case here?
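The hook idea can be sketched without nvbench as follows. The macro name `DEMO_INITIALIZE_CUSTOM_POST` and the helpers `pool_guard`, `hooked_main` and `hook_log` are hypothetical. The sketch assumes that the hook expands as a statement directly inside the main function's scope, so that an RAII object declared by the hook lives until main returns. Whether the real hook in #165 behaves that way should be checked against that PR.

```cpp
#include <string>
#include <vector>

// Records the order of events so it can be inspected afterwards.
inline std::vector<std::string>& hook_log() {
  static std::vector<std::string> log;
  return log;
}

// RAII environment object: acquires in the constructor, releases in the destructor.
struct pool_guard {
  pool_guard(int, char const* const*) { hook_log().push_back("pool created"); }
  ~pool_guard() { hook_log().push_back("pool destroyed"); }
};

// User side: define the hook before the main body is expanded.
#define DEMO_INITIALIZE_CUSTOM_POST(argc, argv) pool_guard demo_pool_guard(argc, argv)

// Library side (hypothetical): default to a no-op that still uses its arguments.
#ifndef DEMO_INITIALIZE_CUSTOM_POST
#define DEMO_INITIALIZE_CUSTOM_POST(argc, argv) ((void)(argc), (void)(argv))
#endif

// Stand-in for a main() assembled from reusable pieces.
int hooked_main(int argc, char const* const* argv) {
  hook_log().push_back("library init");
  DEMO_INITIALIZE_CUSTOM_POST(argc, argv);  // guard is created in this scope
  hook_log().push_back("benchmarks");
  return 0;                                 // guard is destroyed on return
}
```

As with the macro-selected environment type, the hook must be defined in the translation unit that expands the main body. A precompiled main would only ever see the default.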
The only way to get this to work in a downstream project is to not use nvbench::main but to have one of the test files call the NVBENCH_MAIN macro directly.
Yes, the only way to use the customized main is to define your own, either in one of the test files or an object library. The nvbench::nvbench target can be used instead of nvbench::main to avoid conflicting mains.