nerfacc
Runtime error in cudaGraphExecUpdate() from tiny-cuda-nn
Hi, I consistently get a strange error after a few thousand iterations when running this example, or the examples from this other repository:
terminate called after throwing an instance of 'std::runtime_error'
  what(): /tmp/pip-req-build-z4954kz1/include/tiny-cuda-nn/cuda_graph.h:124 cudaGraphExecUpdate(m_graph_instance, m_graph, &error_node, &update_result) failed with error the graph update was not performed because it included changes which violated constraints specific to instantiated graph update
Aborted
After some debugging, I can say that it does not seem to come from tiny-cuda-nn alone, since their training example runs without problems. Also, the error disappears if I replace the RGB output of your rendering function with random values. I'm using PyTorch 1.13 with CUDA 11.6 and V100 cards. Another odd thing is that the error doesn't show up with Titan Xp cards (same PyTorch/CUDA versions).
Do you have any idea why this happens and how to solve it? Thank you in advance!
Seems like it's a hardware-related issue?
I have no idea off the top of my head what could be causing it. But I believe that if you replace the tiny-cuda-nn network with a plain PyTorch MLP, the issue should go away. If that's the case, it would still be somewhat related to tiny-cuda-nn.
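To try that swap, something like the following could stand in for the tiny-cuda-nn network. This is only a rough sketch in stock PyTorch: the class name `PlainMLP`, the helper `check_output_shape`, and the layer sizes are made up for illustration and are not part of nerfacc or tiny-cuda-nn, so the dimensions would need to match whatever the example actually uses.

```python
import torch
import torch.nn as nn


class PlainMLP(nn.Module):
    """Hypothetical stand-in for a tiny-cuda-nn network, using only stock PyTorch layers."""

    def __init__(self, in_dim: int, out_dim: int, hidden_dim: int = 64, num_hidden_layers: int = 2):
        super().__init__()
        layers = []
        dim = in_dim
        # Hidden layers: Linear followed by ReLU.
        for _ in range(num_hidden_layers):
            layers += [nn.Linear(dim, hidden_dim), nn.ReLU()]
            dim = hidden_dim
        # Final linear layer without activation.
        layers.append(nn.Linear(dim, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def check_output_shape() -> tuple:
    # Sanity check on CPU: 8 samples with 3 input features, 4 outputs each.
    model = PlainMLP(in_dim=3, out_dim=4)
    x = torch.rand(8, 3)
    return tuple(model(x).shape)
```

If training gets past the point where it used to abort with this network in place, that would point toward the CUDA graph path in tiny-cuda-nn rather than the rendering code.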
Help needed.