CycleGAN-VC2

torch.cuda.is_available() is False when running train.py, but returns True in the Python shell

Open stm32f405 opened this issue 5 years ago • 7 comments

Dear friend,

PS D:\CycleGAN-VC2-master\CycleGAN-VC2-master> python train.py
Traceback (most recent call last):
  File "train.py", line 520, in <module>
    cycleGAN = CycleGANTraining(logf0s_normalization=logf0s_normalization,
  File "train.py", line 106, in __init__
    self.start_epoch = self.loadModel(restart_training_at)
  File "train.py", line 442, in loadModel
    checkPoint = torch.load(PATH)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\serialization.py", line 774, in _legacy_load
    result = unpickler.load()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\serialization.py", line 730, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\serialization.py", line 175, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\serialization.py", line 151, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\serialization.py", line 135, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

But when I run torch.cuda.is_available() in the Python shell, it returns True. This issue has blocked me for days. Any suggestions would be appreciated.

Thanks very much.
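
The RuntimeError above names its own workaround: passing map_location to torch.load. A minimal sketch, assuming PyTorch is installed; `choose_map_location` and `load_checkpoint` are hypothetical helper names, not functions from train.py:

```python
def choose_map_location(cuda_available: bool) -> str:
    """Return the device string a checkpoint should be mapped onto."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path: str):
    """Load a checkpoint, remapping CUDA tensors to the CPU when needed."""
    import torch  # imported lazily so the helper above works without PyTorch

    device = torch.device(choose_map_location(torch.cuda.is_available()))
    # map_location tells torch.load where to place the stored tensors,
    # regardless of the device they were saved from.
    return torch.load(path, map_location=device)
```

This only lets the checkpoint load. It does not explain why the script cannot see the GPU while the interactive shell can.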

stm32f405 • Nov 20 '20 14:11

On line 19 of train.py, change os.environ["CUDA_VISIBLE_DEVICES"] = "3" to os.environ["CUDA_VISIBLE_DEVICES"] = "0" if you train on a GPU.
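
Why that one line can matter: CUDA_VISIBLE_DEVICES restricts which GPUs a process can see, so "3" on a machine with a single GPU leaves none visible, while a Python shell that never sets the variable still sees the GPU. The sketch below is a simplified model of that rule for integer indices only (UUID-style entries are ignored); `visible_devices` is a hypothetical name used for illustration.

```python
import os
from typing import List


def visible_devices(mask: str, physical_gpu_count: int) -> List[int]:
    """Model which physical GPUs stay visible for a CUDA_VISIBLE_DEVICES value.

    Simplified: integer indices only. Enumeration stops at the first entry
    that is not a valid index, so any later entries are ignored.
    """
    visible = []
    for entry in mask.split(","):
        entry = entry.strip()
        if not entry.isdigit() or int(entry) >= physical_gpu_count:
            break
        visible.append(int(entry))
    return visible


# With one physical GPU, "3" hides everything and "0" exposes the GPU.
print(visible_devices("3", 1))  # []
print(visible_devices("0", 1))  # [0]

# The variable has to be set before CUDA is initialized, so the safest
# place is before torch is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```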

zhanima • Dec 04 '20 15:12

Thanks, I'll try it later.

stm32f405 • Dec 04 '20 15:12

I tried that, but I still get: RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

aylitat • Dec 31 '20 10:12

Run nvidia-smi in cmd to see which GPU index you are using, and put that index in os.environ["CUDA_VISIBLE_DEVICES"] = "HERE". Also check the environment variables on your computer to see whether CUDA_VISIBLE_DEVICES is set there (this may not be necessary, but I suggest it because I did it). If this is your first GPU-based machine learning project, check your CUDA and cuDNN installation. Their versions should match each other and your GPU model, and the CUDA and cuDNN directories should be added to the environment path.
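
The nvidia-smi check can also be scripted. A minimal sketch, assuming nvidia-smi is on the PATH and that `nvidia-smi -L` prints one `GPU <index>: <name> (UUID: ...)` line per device; `parse_gpu_list` and `list_gpus` are hypothetical helper names, and the sample line uses a made-up GPU name and UUID.

```python
import re
import shutil
import subprocess
from typing import List, Tuple

# Matches lines such as "GPU 0: <name> (UUID: <uuid>)".
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: .+\)$")


def parse_gpu_list(text: str) -> List[Tuple[int, str]]:
    """Extract (index, name) pairs from `nvidia-smi -L` style output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2)))
    return gpus


def list_gpus() -> List[Tuple[int, str]]:
    """Run `nvidia-smi -L` if available; return an empty list otherwise."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return parse_gpu_list(result.stdout)


# Illustrative sample line; the GPU name and UUID are made up.
sample = "GPU 0: Example GPU (UUID: GPU-00000000-0000-0000-0000-000000000000)"
print(parse_gpu_list(sample))  # [(0, 'Example GPU')]
```

The indices returned here are the values that can go into CUDA_VISIBLE_DEVICES.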

zhanima • Dec 31 '20 11:12

> If this is your first GPU-based machine learning project, check your CUDA and cuDNN installation. Their versions should match each other and your GPU model, and the CUDA and cuDNN directories should be added to the environment path.

I didn't understand this step. Could you explain it in more detail? Also, I have this problem both on my PC and on Google Colab. Either way, I will try it now. Thanks!

aylitat • Dec 31 '20 11:12

If you haven't successfully run any ML program on a GPU before, your CUDA and cuDNN may be installed incorrectly. The CUDA version is tightly coupled to the cuDNN version and to the version of your GPU driver. After installing them, add the CUDA and cuDNN directories to the environment path on your PC. There are many tutorials online, and I suggest following one of them to install CUDA and cuDNN correctly. If none of the above helps, check whether your torch version matches your CUDA version.
I have not used Google Colab, so I can't help with that.
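
The version checks mentioned above can be read from PyTorch itself. A minimal sketch, assuming PyTorch is installed; `diagnose` and `report` are hypothetical helper names, and the messages are only rough hints. On a CPU-only build, torch.version.cuda is None, which is one way torch.cuda.is_available() ends up False.

```python
from typing import Optional


def diagnose(built_with_cuda: Optional[str], cuda_available: bool) -> str:
    """Turn two facts about the PyTorch install into a rough hint."""
    if built_with_cuda is None:
        return "CPU-only PyTorch build: install a CUDA-enabled build"
    if not cuda_available:
        return "CUDA build, but no usable GPU: check driver and CUDA_VISIBLE_DEVICES"
    return "CUDA build and GPU visible"


def report() -> None:
    """Print the versions PyTorch reports, if PyTorch can be imported."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
        return
    print("torch version:", torch.__version__)
    print("built with CUDA:", torch.version.cuda)
    print("cuDNN version:", torch.backends.cudnn.version())
    print("visible GPUs:", torch.cuda.device_count())
    print(diagnose(torch.version.cuda, torch.cuda.is_available()))


report()
```

Running this inside the same environment and with the same CUDA_VISIBLE_DEVICES value as train.py makes the comparison with the interactive shell meaningful.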

zhanima • Dec 31 '20 11:12

Thanks. As you suggested, I reinstalled CUDA yesterday and it works now. But I don't know why it always prints this:

Training resumed
Iter:4268192 Generator Loss:3.0806 Discrimator Loss:0.6706 GA2B:0.6292 GB2A:0.9648 G_id:0.4997 G_cyc:0.1487 D_A:0.1719 D_B:0.2680: : 0it [00:33, ?it/s]
0it [00:00, ?it/s]
Iter:4268195 Generator Loss:3.5120 Discrimator Loss:0.9223 GA2B:0.7154 GB2A:0.9551 G_id:0.4838 G_cyc:0.1842 D_A:0.3829 D_B:0.4926: : 0it [00:00, ?it/s]
Iter:4268197 Generator Loss:3.5302 Discrimator Loss:0.4369 GA2B:0.7681 GB2A:1.0000 G_id:0.5393 G_cyc:0.1762 D_A:0.0011 D_B:0.1506: : 0it [00:01, ?it/s]
Iter:4268199 Generator Loss:3.2792 Discrimator Loss:0.4787 GA2B:0.8463 GB2A:1.0000 G_id:0.5136 G_cyc:0.1433 D_A:0.0669 D_B:0.2423: : 0it [00:02, ?it/s]
Iter:4268201 Generator Loss:3.8264 Discrimator Loss:0.3022 GA2B:0.8485 GB2A:1.0000 G_id:0.5426 G_cyc:0.1978 D_A:0.0629 D_B:0.1363: : 0it [00:02, ?it/s]
Iter:4268203 Generator Loss:3.3677 Discrimator Loss:0.4790 GA2B:0.8006 GB2A:1.0000 G_id:0.5217 G_cyc:0.1567 D_A:0.0411 D_B:0.1396: : 0it [00:03, ?it/s]
Iter:4268205 Generator Loss:3.3915 Discrimator Loss:0.4330 GA2B:0.8750 GB2A:1.0000 G_id:0.4953 G_cyc:0.1517 D_A:0.0541 D_B:0.0666: : 0it [00:04, ?it/s]
Iter:4268207 Generator Loss:3.7363 Discrimator Loss:0.4464 GA2B:0.8601 GB2A:1.0000 G_id:0.5513 G_cyc:0.1876 D_A:0.0149 D_B:0.1263: : 0it [00:05, ?it/s]
Iter:4268209 Generator Loss:3.5288 Discrimator Loss:0.4888 GA2B:0.8604 GB2A:1.0000 G_id:0.4770 G_cyc:0.1668 D_A:0.0156 D_B:0.0785: : 0it [00:05, ?it/s]

aylitat • Jan 01 '21 13:01