Shivanshu Purohit
Results: 5 comments of Shivanshu Purohit
I have torch 1.8.1 and CUDA 11.1, and I'm still getting this error even after installing the CUDA/C++ extensions.
Did you manage to run on TPU?
No problem. But just FYI, you could probably fit it for CIFAR-10 with 8 GB.
How do you train that far? I'm using the DeepSpeed example and it terminates after 3k steps with seq_len 256, but at least until then the loss doesn't go NaN.
Did you find the solution? I have to write a function to fetch the experiment with the highest ID, so mine is a similar problem.