easybuild-easyblocks Make the CUDA easyblock a bit more generic, to include code samples etc

In relation to #145:

We would like cuda.py to also support the CUDA samples, for v4.x as well as future versions.

This basically implies that the easyconfig should be free to define something like "osdependencies = 'libglut'" when -samples is requested in the installparams:

installparams = "-samplespath=%s/samples/ -toolkitpath=%s -samples -toolkit" % (self.installdir, self.installdir)

It also implies that installparams should likely be defined/driven from the easyconfig.
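A minimal sketch of how the install parameters could be assembled from switches instead of being hard-coded. The helper name `build_installparams` and the `with_toolkit` / `with_samples` switches are hypothetical and are not part of the existing easyblock; only the four installer flags are taken from the line quoted above.

```python
def build_installparams(installdir, with_toolkit=True, with_samples=True):
    """Assemble the installer argument string from boolean switches.

    Hypothetical helper: in a real easyblock the switches would come from
    easyconfig parameters rather than function arguments.
    """
    params = []
    if with_samples:
        # Flag copied from the hard-coded string quoted in this issue.
        params.append("-samplespath=%s/samples/" % installdir)
    if with_toolkit:
        params.append("-toolkitpath=%s" % installdir)
    if with_samples:
        params.append("-samples")
    if with_toolkit:
        params.append("-toolkit")
    return " ".join(params)


if __name__ == "__main__":
    # Default: toolkit and samples, same flags as the hard-coded string.
    print(build_installparams("/opt/cuda"))
    # Samples switched off: only the toolkit flags remain.
    print(build_installparams("/opt/cuda", with_samples=False))
```

With both switches on, this reproduces the hard-coded string; an easyconfig that disables the samples would then also have no need for the libglut OS dependency.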

Mar 20 '13 04:03 fgeorgatos

(building and) installing the samples should be enabled by default, but easy to switch off

Mar 20 '13 05:03 boegel

This came up on the EasyBuild mailing list recently (raised by @ysagon). I'm posting the workaround that @damianam mentioned there, in case other people hit this:

postinstallcmds = [
    'cd %(installdir)s/samples && make SMS="35 37"'
]

This works, but the "35 37" needs to be tuned according to the architectures for which the samples should be built.

Can we come up with something general that we can include in the CUDA easyblock, which should work for most sites?

What are these samples actually used for? (I'm clueless here...)

Mar 26 '19 19:03 boegel

> Can we come up with something general that we can include in the CUDA easyblock, which should work for most sites?

I'm very new to EasyBuild, but ran across this as I work on bootstrapping our PoC cluster. For a sane default, SMS doesn't need to be specified explicitly at all. The Makefiles that ship with the samples set SMS to all the values supported by that version of the CUDA toolkit. It should thus be safe to have the default simply be:

postinstallcmds = [
    'cd %(installdir)s/samples && make'
]

At worst, this results in sample binaries that are slightly larger than necessary, containing pre-optimized kernels for more SM architectures than the hardware in use requires. (It's even forward compatible with newer hardware, since CUDA will JIT-compile kernels to run on the latest hardware if a pre-optimized version isn't in the file. See: https://devblogs.nvidia.com/cuda-pro-tip-understand-fat-binaries-jit-caching/ )

Nicer still would be a way to have that make call use EasyBuild's current value of parallel, but I'm not familiar enough with EasyBuild to know how to accomplish that or if it's even possible. Something conceptually like ... && make -j %(parallel)
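A rough sketch of what that could look like, assuming the EasyBuild version in use resolves a `%(parallel)s` template in the same way as `%(installdir)s` (worth checking against the list of available templates before relying on it). The helper `samples_build_cmd` is hypothetical; the final lines only emulate template resolution with plain Python string formatting, using a made-up install path.

```python
def samples_build_cmd(sms=None):
    """Return a postinstallcmds entry with the templates left unresolved.

    Hypothetical helper: `sms` is an optional list of SM versions; when it
    is omitted, the Makefiles shipped with the samples pick their defaults.
    """
    cmd = "cd %(installdir)s/samples && make -j %(parallel)s"
    if sms:
        cmd += ' SMS="%s"' % " ".join(str(sm) for sm in sms)
    return cmd


# What would go in the easyconfig (templates still unresolved):
postinstallcmds = [samples_build_cmd()]

# Emulate template resolution with illustrative values (assumed, not real paths).
resolved = samples_build_cmd(sms=[35, 37]) % {"installdir": "/apps/CUDA/10.1", "parallel": 8}
print(resolved)
```

Leaving `sms` unset keeps the site-independent default discussed above, while sites that care about binary size can still narrow the list.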

> What are these samples actually used for? (I'm clueless here...)

Probably just about as clueless myself, but I do find that I use them as a quick verification step to ensure CUDA is working properly. Something like:

$ cd samples/bin/x86_64/linux/release/
$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)
<snip>
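
The same check could be scripted, for example as part of a sanity check. This is only a sketch: the function names are hypothetical, and the parsing relies solely on the "Detected N CUDA Capable device(s)" line shown in the output above.

```python
import re
import subprocess


def count_cuda_devices(output):
    """Extract the device count from deviceQuery output (0 if the line is absent)."""
    match = re.search(r"Detected (\d+) CUDA Capable device\(s\)", output)
    return int(match.group(1)) if match else 0


def run_device_query(binary="./deviceQuery"):
    """Run deviceQuery and return the number of devices it reports.

    The default path is an assumption; point it at the built sample binary.
    """
    result = subprocess.run([binary], capture_output=True, text=True, check=False)
    return count_cuda_devices(result.stdout)


# Parsing the output quoted above, without needing a GPU:
sample_output = "./deviceQuery Starting...\n\nDetected 2 CUDA Capable device(s)\n"
print(count_cuda_devices(sample_output))
```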

Apr 02 '19 17:04 adiemu5

This issue was 8 years overdue (!), but it has been handled well in 4.4.0; ref. https://github.com/easybuilders/easybuild-easyblocks/pull/2452/files#diff-9fbfac5aba9b4f62631e81c73b400210bf31c31da365fee4993b982f746b8e3fR21

@boegel, what's your take? If you agree, let's close this.

Jun 04 '21 16:06 fgeorgatos

Done in #2374

Mar 30 '24 10:03 branfosj
