sru
generate-dependencies-with-compile on RTX 3060 with CUDA 11.1
I wonder whether the CUDA implementation of SRU works with CUDA 11. I found on the forum that, to solve this problem, we must use CUDA 10.2: https://discuss.pytorch.org/t/just-in-time-loading-and-compiling-cuda-kernels-was-unsuccesful/124486
Here is my computer info:
Does SRU become slower if we use only PyTorch operations instead of the optimized code written in C++? Is there such a version?
If anyone has solved this problem on an RTX 3060, could you please share?
Hi @v-nhandt21, compilation arguments such as "--generate-dependencies-with-compile" are added automatically by ninja / nvcc.
Looking at your first screenshot, ninja/nvcc attempts to build the code for the sm_75 and compute_75 architectures. However, the correct architecture code for your RTX 3060 is sm_86, according to this article:
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/#ampere-cuda-11-1-and-later
You could search online or ask in the PyTorch forum how to fix this architecture code issue.
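As a starting point, here is a minimal sketch of how one might check which architecture code the GPU needs and pass it to the build. It assumes the kernels are compiled through `torch.utils.cpp_extension`, which reads the `TORCH_CUDA_ARCH_LIST` environment variable; whether that alone resolves this particular error is an assumption, not something confirmed in this thread. The helper names `arch_strings` and `detect_capability` are hypothetical and only for illustration.

```python
import os


def arch_strings(capability):
    """Turn a (major, minor) compute capability into nvcc-style names."""
    major, minor = capability
    return {
        "sm": f"sm_{major}{minor}",            # real architecture, e.g. sm_86
        "compute": f"compute_{major}{minor}",  # virtual architecture, e.g. compute_86
        "arch_list": f"{major}.{minor}",       # format used by TORCH_CUDA_ARCH_LIST
    }


def detect_capability(default=(8, 6)):
    """Ask PyTorch for the capability of GPU 0.

    Falls back to `default` (8.6, the RTX 3060 value mentioned above)
    when torch is not installed or no CUDA device is visible.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return tuple(torch.cuda.get_device_capability(0))
    except ImportError:
        pass
    return default


names = arch_strings(detect_capability())

# Assumption: this must be set before the extension is compiled,
# i.e. before the package that triggers the JIT build is imported.
os.environ["TORCH_CUDA_ARCH_LIST"] = names["arch_list"]
print(names)
```

If the printed `sm` value differs from the one in the nvcc command line of the build log, the mismatch described above is the likely culprit. Clearing any cached build output before retrying may also be needed so that the kernels are actually recompiled.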