[Feature Request]: Allow for testing against CUDA and HIP versions as baseline
Summary of Feature
Currently, our testing infrastructure allows for comparing against a C version of a test, which is usually used to compare performance. It would be nice to be able to test against CUDA and HIP code as well for our GPU support.
Description: https://github.com/chapel-lang/chapel/pull/25614 inspired me to open this issue. There seems to be a lot of wiring needed to add tests that compare against CUDA. I wonder if our testing system could be expanded so that we can easily test against CUDA or HIP versions of GPU code, analogous to how we can compare against C versions. It would probably help us add more tests like the ones in the linked PR (which, AFAIK, we would like to do) in the future with less initial effort and less maintenance.
Our testing system uses the .test.c file extension for the tests that we want to run in our nightly suites. We could use .test.cu, or whichever extension is appropriate, for the CUDA/HIP code we want to test.
Is this issue currently blocking your progress? No
Sample
A simple setup for a test against a CUDA version that is compiled with nvcc could be:
mytest/
| - gpuTest.chpl
| - // other gpuTest testing infra files like gpuTest.good, .compopts, .perfkeys, .whatever
| - gpuTest_cu.test.cu
| - // other gpuTest_cu testing infra files like gpuTest_cu.test.good, .compopts, .perfkeys, .whatever
| - gpuTest.graph // with the perfkeys listed like we normally have
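To make the layout above concrete, here is a rough sketch of how a driver might pick up baseline tests by extension. It is written in Python only for illustration and is not how the existing infrastructure does it; find_baseline_tests and BASELINE_SUFFIXES are hypothetical names, and .test.cu is the proposed extension, not an existing one.

```python
from pathlib import Path

# ".test.c" exists today; ".test.cu" is the extension proposed in this issue.
BASELINE_SUFFIXES = (".test.c", ".test.cu")


def find_baseline_tests(directory):
    """Group the baseline test files found directly in `directory` by suffix."""
    found = {suffix: [] for suffix in BASELINE_SUFFIXES}
    for path in sorted(Path(directory).iterdir()):
        for suffix in BASELINE_SUFFIXES:
            if path.is_file() and path.name.endswith(suffix):
                found[suffix].append(path.name)
    return found
```

Called on the mytest/ directory from the sample, this would list gpuTest_cu.test.cu under ".test.cu" and ignore gpuTest.chpl and gpuTest.graph.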
Some challenges would be:
- How to deal with CUDA vs HIP code for testing? Infrastructure to skip tests automatically, or do we need skipifs for each case when running on hardware with AMD vs Nvidia GPUs?
- How to deal with multiple/different compilers (nvcc vs hipcc vs clang) for different performance tests?
  - Just use .compopts? Or create another file, .compcmd, which just specifies the entire compilation command so we can choose the compiler in that way? Or something else integrated into the infrastructure?
- Should we make it even more agnostic to allow for other things like Intel GPU comparisons or even any other language comparison in the future?
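One possible shape for the compiler and skipping questions, as a minimal sketch under stated assumptions rather than a proposal for the actual implementation: pick a default compiler per extension, let a sibling .compcmd file override the whole command, and skip the test when the compiler is not on PATH. The names build_command and DEFAULT_COMPILERS, the default compilers themselves, and the exact .compcmd file naming are all hypothetical.

```python
import shlex
import shutil
from pathlib import Path

# Hypothetical defaults; the real infrastructure may choose differently.
DEFAULT_COMPILERS = {
    ".test.c": "cc",
    ".test.cu": "nvcc",
}


def build_command(test_file, which=shutil.which):
    """Return (argv, None) to compile test_file, or (None, reason) to skip it."""
    test_file = Path(test_file)
    suffix = next(
        (s for s in DEFAULT_COMPILERS if test_file.name.endswith(s)), None
    )
    if suffix is None:
        return None, f"{test_file.name}: not a recognized baseline test"

    stem = test_file.name[: -len(suffix)]  # e.g. "gpuTest_cu"
    compcmd = test_file.with_name(stem + ".compcmd")  # hypothetical override file
    if compcmd.exists():
        # The file holds the entire compilation command, compiler included.
        argv = shlex.split(compcmd.read_text())
    else:
        output = test_file.with_name(stem)
        argv = [DEFAULT_COMPILERS[suffix], str(test_file), "-o", str(output)]

    if not argv:
        return None, f"{compcmd.name} is empty"
    if which(argv[0]) is None:
        # Automatic skip instead of a hand-written skipif per test.
        return None, f"compiler {argv[0]!r} not found; skipping"
    return argv, None
```

With this shape, a machine without nvcc would skip gpuTest_cu.test.cu automatically, and a test that needs hipcc or clang would only need a one-line .compcmd file. Supporting another language later would mean adding one more entry to the table.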