style-based-gan-pytorch
cannot run due to pickle error
This is not directly an issue with your project; I am rather asking for help. Every time I try to run train.py inside a conda environment, I get the error TypeError: cannot pickle 'Environment' object.
Could you try this? https://github.com/pytorch/examples/issues/526
So you mean directly at the beginning of the main function, rather than creating it and then inserting it?
Yes, right after imports.
Unfortunately, the error persists.
File "<string>", line 1, in <module>
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
(style-based-gan) C:\Users\dawe_\style-based-gan-pytorch>python train.py database
Traceback (most recent call last):
File "train.py", line 344, in <module>
train(args, dataset, generator, discriminator)
File "train.py", line 53, in train
data_loader = iter(loader)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
w.start()
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\context.py", line 326, in _Popen
return Popen(process_obj)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'Environment' object
(style-based-gan) C:\Users\dawe_\style-based-gan-pytorch>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
This is the full error.
I think it is hard to use multiprocessing and lmdb together on Windows. It will be simpler to set num_workers=0 in the DataLoader.
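For context, the traceback above suggests the failure happens when a DataLoader worker is started on Windows: the dataset object is pickled to be sent to the new process, and the open LMDB environment it holds cannot be serialized. Below is a minimal sketch of that failure and of one common workaround (open the handle lazily and drop it when pickling). It uses a `threading.Lock` as a stand-in for the LMDB environment, the class names are hypothetical, and it is not the repository's dataset code.

```python
import pickle
import threading


class EagerDataset:
    """Opens its handle in __init__, so the handle travels with the object."""

    def __init__(self, path):
        self.path = path
        # Stand-in for an open LMDB environment: a lock cannot be pickled either.
        self.env = threading.Lock()


class LazyDataset:
    """Stores only the path and opens the handle on first use in each process."""

    def __init__(self, path):
        self.path = path
        self.env = None

    def _ensure_open(self):
        if self.env is None:
            self.env = threading.Lock()  # stand-in for opening the database
        return self.env

    def __getstate__(self):
        # Drop the handle so the object can be sent to a worker process;
        # the worker reopens it through _ensure_open().
        state = self.__dict__.copy()
        state["env"] = None
        return state


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False on a TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


eager_ok = can_pickle(EagerDataset("database"))

lazy = LazyDataset("database")
lazy._ensure_open()          # handle is open in the parent process
lazy_ok = can_pickle(lazy)   # still picklable, because the handle is dropped

print(eager_ok, lazy_ok)
```

With num_workers=0 no worker process is spawned, so nothing has to be pickled and the problem is sidestepped entirely.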
I did that, and now the error has changed.
0%| | 0/3000000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 344, in <module>
train(args, dataset, generator, discriminator)
File "train.py", line 168, in train
fake_predict.backward()
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\site-packages\torch\tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\site-packages\torch\autograd\__init__.py", line 97, in backward
Variable._execution_engine.run_backward(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
0%| | 0/3000000 [00:02<?, ?it/s]
So it started, at least.
Could you try CUDA_LAUNCH_BLOCKING=1?
Where? Right after the imports or in the main function?
Set it as an environment variable before running the training script. It will let you spot exactly where the error occurred.
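A minimal sketch of one way to do this from Python, assuming the script is launched as a child process; the helper name `run_with_launch_blocking` is hypothetical. The demonstration runs a tiny probe command that only echoes the variable, so it works without a GPU.

```python
import os
import subprocess
import sys


def run_with_launch_blocking(cmd):
    """Run cmd in a child process with CUDA_LAUNCH_BLOCKING=1 in its environment."""
    env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Probe command that only prints the variable as seen by the child process.
# For the real run, the command would be [sys.executable, "train.py", "database"].
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('CUDA_LAUNCH_BLOCKING'))",
]
result = run_with_launch_blocking(probe)
print(result.stdout.strip())
```

In the Windows command prompt used in the logs here, the equivalent is to run `set CUDA_LAUNCH_BLOCKING=1` and then `python train.py database` in the same window.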
(style-based-gan) C:\Users\dawe_\style-based-gan-pytorch>python train.py database
0%| | 0/3000000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 344, in <module>
train(args, dataset, generator, discriminator)
File "train.py", line 168, in train
fake_predict.backward()
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\site-packages\torch\tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "C:\Users\dawe_\Miniconda3\envs\style-based-gan\lib\site-packages\torch\autograd\__init__.py", line 97, in backward
Variable._execution_engine.run_backward(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
0%| | 0/3000000 [00:01<?, ?it/s]
As far as I can tell, the error stays the same.
Sorry, I don't know exactly why the error occurs. Sometimes that error can be resolved by inserting .contiguous(), but it is hard to know where it should go.
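To illustrate what .contiguous() does, here is a small CPU-only sketch with a toy tensor. It only shows how a non-contiguous tensor arises and how it is made contiguous; which tensor in train.py (if any) triggers the cuDNN error is not known from this thread.

```python
import torch

x = torch.arange(12.0).reshape(3, 4)

# Transposing returns a view on the same storage with swapped strides,
# so the elements are no longer laid out densely in row-major order.
t = x.t()
print(t.is_contiguous())

# .contiguous() copies the data into a fresh, densely packed tensor
# with the same shape and values.
c = t.contiguous()
print(c.is_contiguous())
print(torch.equal(t, c))
```

Checking is_contiguous() on the tensors that feed the failing operation is one way to narrow down candidates instead of appending .contiguous() blindly.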
Thank you for your patience.
Is there anything I could do to increase verbosity or find the exact point of the error, or should I just append .contiguous() at random?
Also, that error sometimes occurs when input sizes are too large. Maybe you can try reducing the batch size. But I don't know much about these kinds of errors.
I ran into the same problem. Have you solved it?