Question about graph-classification using GPU

I followed the Graph Classification code and tried to run it on an A100 80G with an Intel(R) Xeon(R) Gold 5320 CPU and CUDA 11.1.

datasets                 2.11.0
transformers             4.28.1
torch                    2.0.0

I have also installed Cython and apex. The code runs, but it is slow. Using nvidia-smi, I observed that the code took up about 21G of GPU memory, but the Volatile GPU-Util was always close to 0 and the run was expected to take 53 hours. This is very different from what the documentation reports for training/fine-tuning for 20 epochs on a CPU (Intel Core i7).

  0%|▏                                           | 2/1020 [06:28<52:59:29, 187.40s/it]

I have set model = model.cuda() and dataloader_num_workers=8. Why does it run so slowly on the GPU? What else should I do to speed up training?

Looking forward to your reply and thanks for the blog.

Apr 27 '23 20:04 LyuLumos
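
A GPU that holds memory but shows near-zero utilization is often waiting on the CPU side of the input pipeline. One way to check is to time how long each batch takes to arrive, with no model involved. The helper below is only a sketch: `time_batches` is a hypothetical name, and it assumes the training data is exposed as an ordinary Python iterable (for example a PyTorch DataLoader).

```python
import time
from typing import Iterable, List


def time_batches(loader: Iterable, max_batches: int = 5) -> List[float]:
    """Return the seconds spent waiting for each of the first `max_batches` batches."""
    waits: List[float] = []
    iterator = iter(loader)
    for _ in range(max_batches):
        start = time.perf_counter()
        try:
            next(iterator)  # only fetch the batch; no forward/backward pass
        except StopIteration:
            break  # the loader had fewer batches than requested
        waits.append(time.perf_counter() - start)
    return waits
```

With the transformers Trainer, this could be called as time_batches(trainer.get_train_dataloader()). If the per-batch wait is close to the reported 187 s/it, the time is going into data loading and collation rather than GPU compute, and moving the model to CUDA alone would not be expected to help.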

Hi @LyuLumos ! I'm in a rush at the moment, but I will investigate this at the end of the month !

May 02 '23 15:05 clefourrier

Hi, in my case it doesn't work at all if I send my model to CUDA. How can I solve this?

Jul 11 '23 06:07 daeyeoplee

Hi @daeyeoplee ! Can you send me your logs?

Jul 11 '23 07:07 clefourrier

@clefourrier This error keeps popping up, but if I use only one sample it works fine, so I don't know what the problem is.

Jul 11 '23 10:07 daeyeoplee

@clefourrier This error keeps popping up, but if I use only one sample it works fine, so I don't know what the problem is.

I have a method that may work.

Step 1. Load the model on the CPU to make sure there are no problems with the model itself.
Step 2. Load the model onto just one GPU.

Jul 11 '23 16:07 LyuLumos

@clefourrier Thanks for your reply. Step 1 is OK, and it worked with batches of samples (not large batches, because of memory and speed problems). Step 2 didn't go well. I keep trying, but now I got this result. I used os.environ['cuda visible device']=1 & device=torch.device('cuda:0') to use just one GPU.

Jul 12 '23 00:07 daeyeoplee
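
A note on the single-GPU setup described above: the environment variable that CUDA reads is CUDA_VISIBLE_DEVICES, its value has to be a string (assigning an int to os.environ raises a TypeError), and it needs to be set before CUDA is initialized in the process. Whether this explains the result in the post is not certain, since the logs are not shown. The sketch below assumes physical GPU 1 is the one to use; `pin_to_single_gpu` is a hypothetical helper, not part of any library.

```python
import os


def pin_to_single_gpu(physical_index: int) -> str:
    """Expose only one physical GPU to this process and return the value that was set."""
    value = str(physical_index)  # os.environ only accepts strings
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this before CUDA is initialized, ideally before importing torch.
pin_to_single_gpu(1)  # assumption: physical GPU 1 is the target

try:
    import torch
except ImportError:  # keeps the sketch runnable on machines without PyTorch
    torch = None

if torch is not None:
    # After masking, the single visible GPU is always addressed as cuda:0.
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print(device)
```

Once the mask is in place, model.to(device) and the batches moved to the same device both end up on the one visible GPU.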
