
Link Against tiny-cuda-nn in C++ Program

half-potato opened this issue 1 year ago • 2 comments

I didn't see any instructions on how to link against tiny-cuda-nn. In fact, it doesn't even have an install target. I saw that iNGP just uses add_subdirectory, but I get the following errors when I do the same in my program:

/home/amai/splinetracers/support/tiny-cuda-nn/include/tiny-cuda-nn/common_device.h(96): error: no suitable constructor exists to convert from "float" to "__half"
   return (half)relu<float>((float)val);
                ^

/home/amai/splinetracers/support/tiny-cuda-nn/include/tiny-cuda-nn/vec.h(214): error: no suitable conversion function from "__half" to "float" exists
   return fmaf(a, b, c);
               ^

Inspecting the code around this error shows that there are some switches based on the CUDA arch, which might not be set correctly:

#ifdef __CUDACC__
inline TCNN_DEVICE __half fma(__half a, __half b, __half c) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
	return __hfma(a, b, c);
#else
	return fmaf(a, b, c);
#endif
}
#endif

Based on the error, I don't think __CUDA_ARCH__ is being set correctly. I have an NVIDIA RTX 3090 with CUDA 12.3, which corresponds to compute_86, well above 60, so I'm not sure why the arch isn't being detected. My CMakeLists.txt is pretty cursed, though, as it integrates OptiX, CUDA, and PyTorch.

half-potato · Mar 19 '24 22:03

I just wanted to add that tiny-cuda-nn does build outside of my project, just not inside it.

half-potato · Mar 19 '24 22:03

You need to set a minimum architecture with TCNN_MIN_GPU_ARCH. For example, I used -DTCNN_MIN_GPU_ARCH=75 to enable features on RTX 2000-series and later cards. Then make sure your CUDA compiler is set to build for at least that instruction set.
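For a plain CMake setup like the one in the original question, here is a minimal sketch of what that can look like. Project and target names are placeholders, and the exact variables tiny-cuda-nn's own CMakeLists reads may differ between versions, so check it against the copy you have:

# Hypothetical top-level CMakeLists.txt of the consuming project.
cmake_minimum_required(VERSION 3.18)
project(my_app LANGUAGES CXX CUDA)

# Choose the SM versions to compile for *before* pulling in tiny-cuda-nn,
# so your targets and its targets agree. 86 = RTX 3090 (Ampere),
# 75 = RTX 2000 series (Turing).
set(CMAKE_CUDA_ARCHITECTURES 86)

add_subdirectory(support/tiny-cuda-nn)

add_executable(my_app main.cu)
target_link_libraries(my_app PRIVATE tiny-cuda-nn)

# If the define is not propagated for you (or you bypass tiny-cuda-nn's CMake
# entirely, as in the Bazel setup below), set the minimum arch yourself:
# target_compile_definitions(my_app PRIVATE TCNN_MIN_GPU_ARCH=86)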

I recently got TCNN building in our Bazel project. See here for details: https://github.com/fbriggs/lifecast_public/commit/aaa1000ccfcd9bf94a143fc21300fe6607636342#diff-aef1b984940ceb407dca09e98a080c07a1cecbb1ae6b386caa0028e03e45bc48 Note that this does not build any of the MLP code, only the encoders; it should just be a matter of adding the other .cu sources.

Also note: I basically had to reimplement all of the functionality from the Python bindings in C++ in order to interface with the C++ version of torch (libtorch).
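For anyone heading down the same path, here is a rough sketch of what driving tiny-cuda-nn directly from C++ alongside libtorch can look like. This is not the library's official torch interop: create_from_config, GPUMatrix, and network->inference roughly follow the C++ sample in tiny-cuda-nn's README, the pointer-wrapping of torch tensors is my own shortcut, and signatures may differ between versions, so verify against the headers you actually build with:

#include <torch/torch.h>
#include <tiny-cuda-nn/config.h>
#include <cuda_runtime.h>

int main() {
	const uint32_t n_input_dims = 3;
	const uint32_t n_output_dims = 16;
	const uint32_t batch_size = 1 << 16;

	// Same JSON config format as the Python bindings use.
	nlohmann::json config = {
		{"loss", {{"otype", "L2"}}},
		{"optimizer", {{"otype", "Adam"}}},
		{"encoding", {{"otype", "HashGrid"}}},
		{"network", {{"otype", "FullyFusedMLP"}, {"n_neurons", 64}, {"n_hidden_layers", 2}}},
	};

	auto model = tcnn::create_from_config(n_input_dims, n_output_dims, config);

	// libtorch tensors supply the device memory. A row-major [batch, dims]
	// tensor has the same layout as tcnn's column-major [dims, batch] matrix.
	torch::Tensor input = torch::rand({batch_size, n_input_dims},
		torch::dtype(torch::kFloat32).device(torch::kCUDA));
	torch::Tensor output = torch::empty({batch_size, n_output_dims},
		torch::dtype(torch::kFloat32).device(torch::kCUDA));

	// Assumed pointer-wrapping constructor: no copy, just a view of the tensors.
	tcnn::GPUMatrix<float> input_matrix(input.data_ptr<float>(), n_input_dims, batch_size);
	tcnn::GPUMatrix<float> output_matrix(output.data_ptr<float>(), n_output_dims, batch_size);

	model.network->inference(input_matrix, output_matrix);
	cudaDeviceSynchronize(); // inference runs asynchronously on a CUDA stream

	// Training goes through model.trainer instead; getting gradients to flow
	// back into a larger torch graph is the part the Python bindings implement
	// and what has to be rebuilt by hand here.
}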

cogwheel · Jun 07 '24 15:06