dfdx

CUDA kernels: JIT vs. compile-time compilation

Open coreylowman opened this issue 2 years ago • 4 comments

Should CUDA kernels be JIT-compiled at runtime, or compiled ahead of time when the program is built? Ideally we can support both easily, via a feature flag or something similar. JIT would give quicker builds, but it pays the compilation cost at runtime.

coreylowman, Sep 12 '22 12:09
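
One way the feature-flag idea from the opening post could look is sketched below. This is only an illustration, not anything dfdx actually ships: the `cuda-jit` feature name, the `KernelCompilation` enum, and the `selected_compilation` function are all hypothetical names invented for the example.

```rust
/// How a CUDA kernel becomes something the driver can load.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KernelCompilation {
    /// Kernels are compiled when the crate is built (slower builds, no runtime cost).
    BuildTime,
    /// Kernel source is shipped as text and compiled on first use (faster builds, runtime cost).
    Jit,
}

/// Picks the strategy from an assumed Cargo feature called `cuda-jit`.
/// `cfg!` expands to a boolean literal at compile time, so the unused branch
/// is trivially optimized away.
pub const fn selected_compilation() -> KernelCompilation {
    if cfg!(feature = "cuda-jit") {
        KernelCompilation::Jit
    } else {
        KernelCompilation::BuildTime
    }
}
```

Calling `selected_compilation()` yields `KernelCompilation::BuildTime` unless the crate is built with the assumed `cuda-jit` feature enabled. A single flag like this keeps both paths in one crate, at the price of the two modes being mutually exclusive within one build.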

Related to #9

coreylowman, Sep 12 '22 12:09

A resource from PyTorch land that I'll post here: https://dev-discuss.pytorch.org/t/keeping-pytorchs-ops-maintainable-the-jiterator/468

coreylowman, Sep 12 '22 12:09

I think this can be done with two separate devices: CudaJIT and Cuda. They can share the underlying kernel code, but their impls can construct the kernels differently.

coreylowman, Sep 12 '22 19:09
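
The two-device idea above could be sketched roughly as follows. Only the device names `Cuda` and `CudaJIT` come from the comment; the `KernelDevice` trait, the `KernelModule` enum, the `load_add_kernel` method, and the sample `add` kernel are assumptions made up for illustration. No real CUDA calls are made, and the build-time PTX is a placeholder string.

```rust
/// Kernel source shared by both devices. A real crate would more likely
/// pull this in with `include_str!` from a `.cu` file.
const ADD_KERNEL_SRC: &str = r#"extern "C" __global__ void add(const float *a, const float *b, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a[i] + b[i];
}"#;

/// Hypothetical description of what a device hands to the CUDA driver.
#[derive(Debug, PartialEq)]
pub enum KernelModule {
    /// PTX produced from the shared source at build time and embedded in the binary.
    PrebuiltPtx(&'static str),
    /// Raw source to be compiled by a runtime compiler (e.g. NVRTC) on first launch.
    PendingJit(&'static str),
}

/// Hypothetical trait: each device decides how its kernels are constructed.
pub trait KernelDevice {
    fn load_add_kernel(&self) -> KernelModule;
}

/// Device whose kernels are compiled when the program is built.
pub struct Cuda;

/// Device whose kernels are compiled at runtime.
pub struct CudaJIT;

impl KernelDevice for Cuda {
    fn load_add_kernel(&self) -> KernelModule {
        // Placeholder text: a build script would compile ADD_KERNEL_SRC and
        // the resulting PTX would be embedded here instead.
        KernelModule::PrebuiltPtx("// PTX generated at build time would go here")
    }
}

impl KernelDevice for CudaJIT {
    fn load_add_kernel(&self) -> KernelModule {
        // Defers compilation: the same source is carried along untouched.
        KernelModule::PendingJit(ADD_KERNEL_SRC)
    }
}
```

Code that is generic over `KernelDevice` would not need to know which strategy is in use, which is the main appeal of splitting the devices rather than branching inside one.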

rust-cuda currently does not include a binding to NVRTC, so one will have to be added.

coreylowman, Sep 14 '22 12:09