Question about graph-classification using GPU

Open LyuLumos opened this issue 1 year ago • 6 comments

I followed the Graph Classification code and tried to run it on an A100 80G with an Intel(R) Xeon(R) Gold 5320 CPU and CUDA 11.1.

datasets                 2.11.0
transformers             4.28.1
torch                    2.0.0

I have also installed Cython and apex. The code runs, but it is slow. According to nvidia-smi, it uses about 21G of GPU memory, but Volatile GPU-Util stays close to 0 and the estimated run time is 53 hours, which is very different from what the documentation reports for training/fine-tuning for 20 epochs on CPU (Intel Core i7).

  0%|▏                                           | 2/1020 [06:28<52:59:29, 187.40s/it]

I have set model = model.cuda() and dataloader_num_workers=8. Why does it run so slowly on the GPU? What else should I do to speed up training?

Looking forward to your reply and thanks for the blog.

LyuLumos avatar Apr 27 '23 20:04 LyuLumos
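
Near-zero GPU-Util combined with roughly 187 s per iteration is consistent with the GPU waiting on CPU-side work such as data loading or collation, although nothing in this thread confirms that. A minimal sketch of how one might check it is to time the wait for each batch separately from the training step. The names `profile_loop` and `slow_batches` are hypothetical, and the toy generator only stands in for a real dataloader.

```python
import time

def profile_loop(batches, step_fn, max_steps=10):
    """Split wall time into 'waiting for the next batch' and 'running the step'."""
    data_s = 0.0
    step_s = 0.0
    it = iter(batches)
    for _ in range(max_steps):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading / collation
        except StopIteration:
            break
        t1 = time.perf_counter()
        # Caveat: CUDA kernels launch asynchronously, so a real GPU step
        # should synchronize before t2 is read for the timing to be meaningful.
        step_fn(batch)
        t2 = time.perf_counter()
        data_s += t1 - t0
        step_s += t2 - t1
    return data_s, step_s

def slow_batches(n, delay=0.02):
    """Toy stand-in for a dataloader whose batches are slow to produce."""
    for i in range(n):
        time.sleep(delay)
        yield i

data_s, step_s = profile_loop(slow_batches(5), lambda batch: None)
print(f"data: {data_s:.3f}s, step: {step_s:.3f}s")
```

If the data time dominates in a real run, the bottleneck is upstream of the GPU, and moving the model to CUDA alone would not be expected to help.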

Hi @LyuLumos ! I'm in a rush at the moment, but I will investigate this at the end of the month !

clefourrier avatar May 02 '23 15:05 clefourrier

Hi, in my case it doesn't work at all if I send my model to CUDA. How can I solve this?

daeyeoplee avatar Jul 11 '23 06:07 daeyeoplee

Hi @daeyeoplee ! Can you send me your logs?

clefourrier avatar Jul 11 '23 07:07 clefourrier

@clefourrier (image) This error keeps popping up, but if I use only one sample it works fine, so I don't know what the problem is.

daeyeoplee avatar Jul 11 '23 10:07 daeyeoplee

@clefourrier (image) This error keeps popping up, but if I use only one sample it works fine, so I don't know what the problem is.

I have a method that may work.

Step 1. Load model to CPU to ensure there are no problems with your model.

Step 2. Load your model onto just one GPU.

LyuLumos avatar Jul 11 '23 16:07 LyuLumos

@clefourrier Thanks for your reply. Step 1 is OK: it worked with batches of samples (not large batches, because of memory and speed problems). Step 2 didn't go well. I keep trying, but for now I get this result: (image). I used os.environ['cuda visible device']=1 and device=torch.device('cuda:0') to use just one GPU.

daeyeoplee avatar Jul 12 '23 00:07 daeyeoplee
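
On the single-GPU setup described above: the environment variable that CUDA reads is `CUDA_VISIBLE_DEVICES`, and values assigned to `os.environ` must be strings, so if the snippet in the comment is literal it would not restrict the process to one GPU. A minimal sketch follows, assuming GPU index 1 is the one wanted. `pick_device` is a hypothetical helper, and the torch import is guarded so the sketch also runs on a machine without PyTorch.

```python
import os

# Set this before CUDA is initialized (safest: before importing torch).
# The value must be a string; "1" exposes only the GPU with index 1.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

def pick_device():
    """Hypothetical helper: 'cuda:0' if a GPU is visible to this process, else 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # After masking, the single visible GPU is renumbered as cuda:0.
    return "cuda:0" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(device)
```

The model and every tensor in a batch would then need to be moved to the same `device`; whether a mismatch there explains the error in the screenshots is not something this thread confirms.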