Optix kernel build failure after rebuilding PTX of custom shapes

Open dvicini opened this issue 2 years ago • 5 comments

Hi,

I was trying to rebuild the PTX kernels of the custom shapes by running make in the resources/ptx directory. After rebuilding, Mitsuba fails to compile the scene's OptiX kernel (line 201 in scene_optix.inl), e.g., when running test_sphere.py.

I am using Linux with driver version 525.147.05 and CUDA 12.0. I downloaded the OptiX 7 SDK to obtain the header needed for the custom shape compilation.

My compiled optix_rt.ptx differs in a few places from the bundled one, but it is hard for me to tell what is really going on.

Any help would be greatly appreciated!

Here is the diff of the optix_rt.ptx file: ptx_diff.txt

dvicini avatar Jan 05 '24 13:01 dvicini

Update: I think the problem is somehow related to the curve primitives. If I remove those, everything seems to work again. Maybe something changed in how OptiX handles curves?

dvicini avatar Jan 05 '24 14:01 dvicini

We're compiling this kernel code with a fixed and intentionally old version of the CUDA SDK, specifically 10.2 (a bit like manylinux in the Python world). Can you give that one a try?

wjakob avatar Jan 05 '24 14:01 wjakob

I see, I will give that a try. It might be a bit tricky to downgrade in my setup.

Worst case, I can disable curve support as a workaround for now. With curves disabled, everything seems to work as intended again.

dvicini avatar Jan 05 '24 14:01 dvicini

It's easy to download the Linux tarball and run 'nvcc' from there; no need to install anything AFAIK.
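To confirm which toolkit release a particular nvcc binary belongs to (for instance one unpacked from a tarball rather than installed system-wide), something like the following sketch might help. It assumes the usual "release X.Y" text in the output of `nvcc --version`; the helper names and the path in the usage note are hypothetical.

```python
import re
import subprocess


def parse_nvcc_release(version_output: str) -> tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`.

    Assumes the output contains a fragment such as 'release 10.2'.
    """
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no 'release X.Y' fragment found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def nvcc_release(nvcc_path: str) -> tuple[int, int]:
    """Run a specific nvcc binary and return its (major, minor) release."""
    result = subprocess.run(
        [nvcc_path, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_nvcc_release(result.stdout)
```

Calling `nvcc_release("/path/to/unpacked/cuda/bin/nvcc")` (a placeholder path) would then report the release of that exact binary, independent of whatever `nvcc` is first on the PATH.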

wjakob avatar Jan 05 '24 14:01 wjakob

I think CUDA itself is fine, but it seems challenging to get a compatible GCC 8 installed / set up here.

dvicini avatar Jan 05 '24 15:01 dvicini

Hi @dvicini

Double-check which nvcc version you're running. According to the PTX header you sent, you're using CUDA 12.3, which generates PTX ISA v8.3. This doesn't match what you said in your original comment. PTX v8.3 is only supported by driver versions greater than or equal to v545, which might explain the error message if you're indeed using v525.
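The PTX ISA version can be read directly from the `.version` directive at the top of a generated file. The sketch below does that and compares it against a caller-supplied maximum. The function names are hypothetical, and the `(8, 0)` limit used in the usage note is only an assumption drawn from this thread (CUDA 12.0 output ended up working with a 525 driver), not an official compatibility table.

```python
import re


def ptx_isa_version(ptx_text: str) -> tuple[int, int]:
    """Return (major, minor) from the `.version X.Y` directive of a PTX file."""
    match = re.search(r"^\s*\.version\s+(\d+)\.(\d+)", ptx_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no .version directive found in PTX text")
    return int(match.group(1)), int(match.group(2))


def ptx_too_new(ptx_text: str, max_supported: tuple[int, int]) -> bool:
    """True if the PTX ISA version exceeds what the caller says the driver accepts.

    `max_supported` is supplied by the caller; this function does not know
    which ISA versions any given driver supports.
    """
    return ptx_isa_version(ptx_text) > max_supported
```

For example, `ptx_too_new(open("optix_rt.ptx").read(), (8, 0))` would flag a file generated with `.version 8.3`.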

In addition, in the diff you sent, I see that one of the OptiX curve functions, optixGetCurveParameter, gets mapped to the PTX function call _optix_get_curve_parameter. This should only happen with OptiX 8.0 (see YOUR_OPTIX_FOLDER/include/internal/optix_device_impl.h); previous versions would generate a call to _optix_get_attribute_0. I think this is the main culprit.
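A quick way to see which of the two intrinsics a compiled PTX file references is to search its text for both names. This is a rough sketch with a hypothetical helper name; it only looks for the two identifiers mentioned above and says nothing about other differences between OptiX header versions.

```python
import re

# The two PTX-level names discussed above.
CURVE_INTRINSICS = ("_optix_get_curve_parameter", "_optix_get_attribute_0")


def optix_curve_intrinsics(ptx_text: str) -> set[str]:
    """Return which of the two intrinsic names occur in the PTX text."""
    found = set()
    for name in CURVE_INTRINSICS:
        # Word boundaries avoid matching a longer identifier that merely
        # contains one of the names as a prefix or suffix.
        if re.search(r"\b" + re.escape(name) + r"\b", ptx_text):
            found.add(name)
    return found
```

If the result for a rebuilt optix_rt.ptx contains `_optix_get_curve_parameter` while the bundled file only contains `_optix_get_attribute_0`, the OptiX headers used for the rebuild are likely newer than the ones the bundled kernels were compiled against.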

njroussel avatar Jan 08 '24 13:01 njroussel

Thanks for taking a look! The version issue makes sense. I have now downgraded to CUDA 12.0. This still bumps the PTX ISA from 7.6 to 8.0.

My local OptiX SDK is 7.7, which already has the _optix_get_curve_parameter call. Do I need an older version there as well?

Edit: It seems so. I will try that.

dvicini avatar Jan 09 '24 10:01 dvicini

You were right: with CUDA 12.0 and OptiX 7.6, it all works again. Both need to be chosen to be compatible with the installed driver. Sorry, I had completely forgotten how sensitive they are to the driver version. Thanks for the help!

dvicini avatar Jan 09 '24 10:01 dvicini