
GPU PyTorch slower than cpu

Open bsmy12 opened this issue 1 year ago • 1 comment

Hi, I am running the standard PyTorch TAPIR model on a GPU, but noticed that inference is significantly slower than on the CPU, which seems backwards. I am running on 400 frames with only 2 query points; the frame size is 512x512x3. I can confirm it is actually running on the GPU, since both GPU utilization and memory usage increase. The GPU is an NVIDIA GeForce RTX 2080. Any guidance on what could be happening here?

CPU: ~325 s to complete inference
GPU: ~1025 s to complete inference
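One thing worth ruling out when comparing CPU and GPU numbers: CUDA kernel launches are asynchronous, so naive wall-clock timing can be misleading unless you synchronize before reading the clock. Below is a minimal, hedged timing sketch; `timed_inference` is a hypothetical helper (not part of TAPIR), and the model/frames here stand in for whatever you are actually running.

```python
import time
import torch

def timed_inference(model, frames, device):
    """Run one forward pass and return wall-clock seconds.

    CUDA kernels launch asynchronously, so we call
    torch.cuda.synchronize() before reading the clock on both ends;
    otherwise the measured GPU time can be wildly off.
    """
    model = model.to(device).eval()
    frames = frames.to(device)
    if device.type == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    with torch.no_grad():
        model(frames)
    if device.type == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start
```

Usage would be something like `timed_inference(tapir_model, frames, torch.device("cuda"))` versus the same call with `torch.device("cpu")`. That said, a 3x slowdown over ~17 minutes is unlikely to be a timing artifact alone.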

bsmy12 avatar Nov 15 '24 14:11 bsmy12

Hi @bsmy12 , I am trying to reproduce your problem using our colab, but I cannot.

Here is the runtime I got on an L4 GPU (a T4 GPU runs out of memory) with your setup: 2 query points, 400 frames, 512x512x3 frame size. On the L4 GPU, inference takes 9 seconds.

Unfortunately we don't have an NVIDIA GeForce RTX 2080 at the moment. I suspect an outdated or mismatched NVIDIA driver or CUDA toolkit, either of which can severely degrade GPU performance.

Suggestion: ensure the RTX 2080 system uses a driver and CUDA version compatible with your PyTorch build. Check the PyTorch release compatibility matrix: https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix
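To check those versions quickly, a sketch like the following (assuming PyTorch is installed; the `nvidia-smi` call only works where the NVIDIA driver is present) prints the values you would compare against the compatibility matrix:

```python
import shutil
import subprocess
import torch

# Versions that must line up: the CUDA version PyTorch was built
# against, and the installed NVIDIA driver.
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)  # None for CPU-only builds
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))

# Driver version as reported by the driver itself (skipped if
# nvidia-smi is not on PATH).
if shutil.which("nvidia-smi"):
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    print("driver:", out.stdout.strip())
```

If `torch.version.cuda` is `None`, you have a CPU-only build; if the driver is older than what the reported CUDA version requires, that mismatch alone can explain poor GPU throughput.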

yangyi02 avatar Nov 29 '24 21:11 yangyi02