easybuild-easyblocks
Make the CUDA easyblock a bit more generic, to include code samples etc
In relation to #145:
We would like cuda.py to also support the CUDA samples, as well as v4.x and future versions.
This basically implies that the easyconfig should be free to define something like "osdependencies = 'libglut'" when -samples is requested in the installparams:
installparams = "-samplespath=%s/samples/ -toolkitpath=%s -samples -toolkit" % (self.installdir, self.installdir)
It also implies that installparams should likely be defined/driven from the easyconfig.
(Building and) installing the samples should be enabled by default, but easy to switch off.
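A minimal sketch of how the installer arguments could be assembled from a switch instead of being hard-coded. The helper name `build_installparams` and the `with_samples` flag are hypothetical, not part of the existing easyblock or the EasyBuild API; only the installer options themselves come from the line quoted above.

```python
def build_installparams(installdir, with_samples=True):
    """Assemble the CUDA installer options (hypothetical helper, not EasyBuild API)."""
    params = []
    if with_samples:
        # Samples go into a subdirectory of the installation directory.
        params.append("-samplespath=%s/samples/" % installdir)
    params.append("-toolkitpath=%s" % installdir)
    if with_samples:
        params.append("-samples")
    params.append("-toolkit")
    return " ".join(params)


# Samples enabled by default, but easy to switch off.
print(build_installparams("/opt/cuda"))
print(build_installparams("/opt/cuda", with_samples=False))
```

With samples enabled this reproduces the option string above; with `with_samples=False` only the toolkit options remain, so no libglut OS dependency would be needed in that case.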
This came up on the EasyBuild mailing list recently (raised by @ysagon). Posting the workaround that @damianam mentioned there, in case people hit this:
postinstallcmds = [
'cd %(installdir)s/samples && make SMS="35 37"'
]
This works, but the "35 37" needs to be tuned according to the architectures for which the samples should be built.
Can we come up with something general that we can include in the CUDA easyblock, which should work for most sites?
What are these samples actually used for? (I'm clueless here...)
Can we come up with something general that we can include in the CUDA easyblock, which should work for most sites?
I'm very new to EasyBuild, but ran across this as I work on bootstrapping our PoC cluster. For a sane default, SMS doesn't need to be specified explicitly at all. The Makefiles that ship with the samples set SMS to all the values supported by that version of the CUDA toolkit. It should thus be safe to have the default simply be:
postinstallcmds = [
'cd %(installdir)s/samples && make'
]
At worst, this results in sample binaries that are slightly larger than necessary, because they contain pre-optimized kernels for more SM architectures than the hardware in use requires. (It's even forward compatible with newer hardware, since CUDA will JIT-compile kernels to run on the latest hardware if a pre-optimized version isn't in the file. See: https://devblogs.nvidia.com/cuda-pro-tip-understand-fat-binaries-jit-caching/ )
Nicer still would be a way to have that make call use EasyBuild's current value of parallel, but I'm not familiar enough with EasyBuild to know how to accomplish that, or whether it's even possible. Something conceptually like ... && make -j %(parallel)
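To make the idea concrete, here is a rough sketch of composing that command with an optional SMS list and an optional job count. The function `samples_make_cmd` and its parameters are hypothetical, for illustration only; how EasyBuild itself would supply the parallel value is left open here.

```python
def samples_make_cmd(installdir, sms=None, parallel=None):
    """Build the shell command for compiling the CUDA samples (hypothetical helper)."""
    cmd = "cd %s/samples && make" % installdir
    if parallel:
        # Only pass -j when a job count is actually known.
        cmd += " -j %d" % parallel
    if sms:
        # Leaving SMS unset lets the samples' Makefiles pick their own defaults.
        cmd += ' SMS="%s"' % " ".join(str(s) for s in sms)
    return cmd


print(samples_make_cmd("/opt/cuda"))
print(samples_make_cmd("/opt/cuda", sms=[35, 37], parallel=4))
```

Called with no extra arguments it yields the plain `make` default suggested above; sites that want a smaller build can pass their own architecture list.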
What are these samples actually used for? (I'm clueless here...)
Probably just about as clueless myself, but I do find that I use them as a quick verification step to ensure CUDA is working properly. Something like:
$ cd samples/bin/x86_64/linux/release/
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 2 CUDA Capable device(s)
<snip>
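If that check were ever automated, a sketch like the following could pull the device count out of the output. It assumes the "Detected N CUDA Capable device(s)" line looks exactly as in the excerpt above, which may differ between CUDA versions; `detected_devices` is a hypothetical helper name.

```python
import re


def detected_devices(output):
    """Return the device count reported by deviceQuery, or 0 if no such line is found."""
    match = re.search(r"Detected (\d+) CUDA Capable device\(s\)", output)
    return int(match.group(1)) if match else 0


# Sample text copied from the excerpt above.
sample_output = """./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 2 CUDA Capable device(s)
"""
print(detected_devices(sample_output))
```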
This issue has been overdue for 8 years (!), but it was handled well in 4.4.0; ref. https://github.com/easybuilders/easybuild-easyblocks/pull/2452/files#diff-9fbfac5aba9b4f62631e81c73b400210bf31c31da365fee4993b982f746b8e3fR21
@boegel your take? If you agree, let's close.
Done in #2374