Ardian Umam
-Update- Hi, it is solved. The problem was that some files were not fully downloaded during the cloning process, and there were some errors during my installation (`sudo ./install.sh`). For those...
Very useful feature. Will vote this up!
> found the issue.
>
> change all
>
> ```
> #define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
> #define CHECK_CONTIGUOUS(x) AT_CHECK(x.is_contiguous(), #x, " must be...
> ```
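For context, the quoted macros come from a PyTorch C++/CUDA extension, and they stop compiling on newer PyTorch because `AT_CHECK` was removed (around PyTorch 1.5) in favor of `TORCH_CHECK`, and `x.type().is_cuda()` was deprecated in favor of `x.is_cuda()`. A sketch of the updated macros, assuming the surrounding extension includes the usual `torch/extension.h` header (the third `CHECK_INPUT` macro is the conventional companion, not confirmed by the truncated quote):

```
// Requires: #include <torch/extension.h> (PyTorch >= 1.5)
#define CHECK_CUDA(x) TORCH_CHECK(x.is_cuda(), #x, " must be a CUDA tensor")
#define CHECK_CONTIGUOUS(x) TORCH_CHECK(x.is_contiguous(), #x, " must be contiguous")
#define CHECK_INPUT(x) CHECK_CUDA(x); CHECK_CONTIGUOUS(x)
```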
Hi, thanks for the reply. I tried to run the `mkdir` just as mentioned above and it's OK. Then I tried to set everything up again from the beginning on my Ubuntu 18 server...
I have 9 GPUs in the server. Here is the output after removing `--distributed` from the script and changing `deepspeed train.py ...` to `python3 train.py`. I'm trying to investigate...
Many thanks for your reply :) I have already modified the code so that all the related tensors are on the CUDA device, and now I get this error. I am already using batch_size=3 (small enough)...
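The fix mentioned here, moving every related tensor onto the same device as the model, can be sketched like this (a minimal standalone example, not the thread's actual model; the `Linear` layer and shapes are placeholders):

```python
import torch

# Pick CUDA when available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical small model; .to(device) moves its parameters.
model = torch.nn.Linear(4, 2).to(device)

# Inputs must be moved to the SAME device, or the forward pass
# raises a device-mismatch error like the one in this thread.
x = torch.randn(3, 4).to(device)  # batch_size=3, as in the comment

out = model(x)
```

The key point is that `.to(device)` must be applied to the model and to every input tensor; forgetting a single tensor (labels, masks, etc.) is enough to trigger the error.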
Yes, I can run the above code without error. It turns out that I installed torch built against cudatoolkit=10, while my nvidia-smi reports CUDA 11. After upgrading torch to the CUDA 11 build,...
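A quick way to spot this kind of mismatch is to compare the CUDA toolkit PyTorch was built against with what `nvidia-smi` reports for the driver; a sketch:

```python
import torch

# CUDA toolkit version the installed torch wheel was built with
# (None for CPU-only builds); compare this against the driver's
# supported CUDA version shown in the top row of `nvidia-smi`.
print(torch.__version__)
print(torch.version.cuda)

# False here, despite a working GPU, often signals exactly the
# toolkit/driver mismatch described in this comment.
print(torch.cuda.is_available())
```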