HVM
adds dynamic shared mem allocation to cuda kernels
With this change, hvm.cu should work on virtually every NVIDIA GPU out there (assuming compute capability 5.0 and above). The shared memory size is picked from the GPU's capabilities: 3072 bytes less than the max opt-in shared memory available per block. The 3072 bytes are held back because some other shared arrays use roughly (a little less than) that amount of shared memory.
Since the shared memory size needs to be known at compile time, `get_shared_mem.cu` calculates the available shared memory at build time. It is run by `build.rs`, which then generates a header file, `shared_mem_config.h`, with the correct hex value for the local net.
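
For reviewers, here is a rough sketch of the header-generation step in Rust. It is not the code in this PR: the macro name `HVM_SHARED_MEM`, the helper names, and the exact header layout are assumptions made for illustration, and the sketch writes a byte count, while the real header may store a different unit. Only the 3072-byte reserve comes from the description above.

```rust
/// Bytes held back for the kernel's other shared arrays
/// (the figure quoted in this PR description).
const RESERVED_BYTES: u32 = 3072;

/// Shared memory left for the local net, given the device's max opt-in
/// shared memory per block. Returns `None` if the device reports less
/// than the reserved amount, so the build script can fail loudly.
fn local_net_budget(max_optin_bytes: u32) -> Option<u32> {
    max_optin_bytes.checked_sub(RESERVED_BYTES)
}

/// Render the text of the generated header.
/// `HVM_SHARED_MEM` is a hypothetical macro name, not taken from the PR.
fn render_header(budget_bytes: u32) -> String {
    format!(
        "// Generated at build time; do not edit.\n#define HVM_SHARED_MEM 0x{:X}\n",
        budget_bytes
    )
}
```

With a device that reports 101376 bytes (99 KiB) of opt-in shared memory, `local_net_budget` yields 98304, and `render_header(98304)` emits `#define HVM_SHARED_MEM 0x18000`. A real `build.rs` would get the input number from the output of the compiled probe and write the string under the source tree or `OUT_DIR`.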
Closes: #283 and #314 (supposedly)