The Volatile GPU-Util is always 0, in examples/imagenet

Open wangxianrui opened this issue 6 years ago • 10 comments

I ran the ImageNet example from https://github.com/pytorch/examples/tree/master/imagenet. It runs successfully, but it is slow, and the Volatile GPU-Util reported by 'nvidia-smi' is always 0.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:01:00.0  On |                  N/A |
| 31%   58C    P2    70W / 250W |   9584MiB / 11170MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       947      G   /usr/lib/xorg/Xorg                           285MiB |
|    0      1752      G   compiz                                       154MiB |
|    0      1930      G   fcitx-qimpanel                                 9MiB |
|    0      4690      G   ...quest-channel-token=4115043597718524916    72MiB |
|    0     26519      C   python                                      9057MiB |
+-----------------------------------------------------------------------------+

wangxianrui avatar Dec 13 '18 06:12 wangxianrui

Did you try increasing num-workers? Maybe something like 16?

surgan12 avatar Dec 13 '18 08:12 surgan12

Yes, I have tried, but it doesn't work.

wangxianrui avatar Dec 13 '18 08:12 wangxianrui

What batch size are you using?

surgan12 avatar Dec 13 '18 10:12 surgan12

256

wangxianrui avatar Dec 13 '18 10:12 wangxianrui

I had much the same problem, but increasing the batch size and the number of workers did the trick for me.

surgan12 avatar Dec 13 '18 10:12 surgan12

So what num-workers and batch-size did you set?

wangxianrui avatar Dec 13 '18 10:12 wangxianrui

I set the batch size to something around 500 and num_workers to 16.
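Before tuning these two values, it can help to check whether the training loop is actually waiting on the data loader. The following is a minimal sketch, not part of the example script: `profile_loader` and `step_fn` are hypothetical names, and it assumes any iterable of batches (for instance a PyTorch `DataLoader`) plus a callable that runs one training step.

```python
import time


def profile_loader(loader, step_fn, max_batches=50):
    """Return (seconds spent waiting for batches, seconds spent in step_fn).

    `loader` is any iterable of batches; `step_fn` is a hypothetical callable
    that runs one training step on a batch. For GPU work, step_fn should block
    until the step has finished (e.g. by calling torch.cuda.synchronize()),
    because CUDA kernels are launched asynchronously.
    """
    data_time = 0.0
    step_time = 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is time the GPU sits idle
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
    return data_time, step_time
```

If the first number dominates, the GPU is mostly idle waiting for input, and more workers (or faster storage and decoding) is the likely fix. If the second dominates, the loader is keeping up and changing num-workers should make little difference.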

surgan12 avatar Dec 13 '18 11:12 surgan12

Given that the power consumption is 70W, I would say the GPU is actually computing. I think this is a bug in nvidia-smi; I see the same behaviour.
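A single `nvidia-smi` snapshot can land in a gap between batches, so one way to look into this is to sample utilization and power draw repeatedly while training runs. This is a rough sketch under that assumption: `parse_sample` and `sample_gpu` are hypothetical helper names, and only the `--query-gpu` and `--format` options of `nvidia-smi` are relied on.

```python
import subprocess
import time

# Ask nvidia-smi for GPU utilization (%) and power draw (W) as bare CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_sample(line):
    """Parse one CSV line into a tuple of floats; non-numeric fields become None."""
    values = []
    for field in line.split(","):
        try:
            values.append(float(field.strip()))
        except ValueError:
            values.append(None)  # e.g. "[N/A]" on GPUs that do not report a value
    return tuple(values)


def sample_gpu(count=10, interval=0.5):
    """Collect `count` samples for the first GPU, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        samples.append(parse_sample(out.splitlines()[0]))
        time.sleep(interval)
    return samples
```

Calling `sample_gpu()` in a second terminal while training is running gives a short series of (utilization, power) pairs. If utilization is 0 in every sample while power stays well above idle, that would support the suspicion that the reported number is wrong; if it jumps between 0 and high values, the GPU is more likely starved for data.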

garofas avatar Apr 18 '19 15:04 garofas

Hello, do you know how to download the ImageNet dataset for this program to use? Please tell me, thank you very much!

tuji-sjp avatar Apr 22 '19 16:04 tuji-sjp

Has anyone solved this problem?

JiyueWang avatar Nov 26 '20 08:11 JiyueWang