TypeError: can't pickle Environment objects
Hello,
I'm trying to run the dcgan/main.py file to train a GAN. I'm using a Windows 7 system with Python 3.7 (Anaconda).
I ran the following line: %run main.py --dataset lsun --dataroot bedroom_train_lmdb/ --niter 1
and got the following output:
Namespace(batchSize=64, beta1=0.5, cuda=False, dataroot='bedroom_train_lmdb/', dataset='lsun', imageSize=64, lr=0.0002, manualSeed=None, ndf=64, netD='', netG='', ngf=64, ngpu=1, niter=1, nz=100, outf='.', workers=2)
Random Seed: 482
Generator(
  (main): Sequential(
    (0): ConvTranspose2d(100, 512, kernel_size=(4, 4), stride=(1, 1), bias=False)
    (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace)
    (3): ConvTranspose2d(512, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace)
    (6): ConvTranspose2d(256, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (7): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (8): ReLU(inplace)
    (9): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (10): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (11): ReLU(inplace)
    (12): ConvTranspose2d(64, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (13): Tanh()
  )
)
Discriminator(
  (main): Sequential(
    (0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): LeakyReLU(negative_slope=0.2, inplace)
    (2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (4): LeakyReLU(negative_slope=0.2, inplace)
    (5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): LeakyReLU(negative_slope=0.2, inplace)
    (8): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (9): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (10): LeakyReLU(negative_slope=0.2, inplace)
    (11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), bias=False)
    (12): Sigmoid()
  )
)
Traceback (most recent call last):
File "Y:\Research\Davide\ML\GAN\lsun-master\main.py", line 210, in
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__ return _DataLoaderIter(self)
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__ w.start()
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\process.py", line 112, in start self._popen = self._Popen(self)
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\context.py", line 223, in _Popen return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\context.py", line 322, in _Popen return Popen(process_obj)
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__ reduction.dump(process_obj, to_child)
File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\reduction.py", line 60, in dump ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle Environment objects
It must be something related to Windows. Any suggestions on how to solve this issue? Thanks.
Add the following lines to the end of the imports section, right after import torchvision.utils as vutils:
if __name__ == '__main__':
    torch.multiprocessing.set_start_method('spawn')
Ideally, the script needs to be refactored to push everything into a main() function (that's the original problem).
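For anyone wondering what that refactor would look like, here is a minimal sketch of the structure, not the actual dcgan/main.py. On Windows, worker processes are started by launching a fresh interpreter that re-imports the script, so anything left at module top level runs again in every worker. The names train() and main() are hypothetical placeholders; the dataroot, workers and niter values are just the ones from the run reported above.

```python
def train(dataroot, workers=2, niter=1):
    """Placeholder for dataset creation, model setup and the training loop.

    In the real script, the DataLoader(..., num_workers=workers) call and
    everything that depends on it would live in here.
    """
    steps = []
    for epoch in range(niter):
        steps.append(f"epoch {epoch}: data from {dataroot}, workers={workers}")
    return steps


def main():
    # Hypothetical entry point: nothing with side effects stays at module level.
    for line in train("bedroom_train_lmdb/", workers=2, niter=1):
        print(line)


if __name__ == "__main__":
    # Only the process that was started as the script gets here. A worker
    # that merely re-imports this file skips the call.
    main()
```

This only fixes the re-import problem. It does not make an unpicklable dataset picklable.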
I have the same error, with a different file but otherwise about the same.
I am seeing this when running on Windows 10. It is solved when I set num_workers=0 for the DataLoader().
@soumith Hello, I have the same issue. I have tried setting the multiprocessing start method to spawn, but it makes no difference and the error still occurs.
Could you please tell me another way to solve it?
I am seeing this when running it on Windows 10, it is solved when I set num_workers=0 for the DataLoader()
Perfect solution, but what is the specific reason?
@soumith Can you elaborate on the issue here? The common factor between my code and this code is LMDB, and mine produces the exact same error. Does this have something to do with trouble pickling the lmdb instance?
The issue is that you cannot pickle LMDB env objects. Setting num_workers=0 removes the need to pickle anything, since the original object in the main process handles retrieving the data.
The real solution is to store the Environment variable in a class with custom __getstate__() and __setstate__() methods that delete the LMDB Environment variable from the returned dictionary and then regenerate it when loaded.
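A rough sketch of that idea, using Python's pickle hooks __getstate__() and __setstate__(). To keep it runnable without LMDB installed, FakeEnv is a hypothetical stand-in that refuses to be pickled the same way the real Environment does, and open_env() is a hypothetical helper. With the real library you would presumably call something like lmdb.open(path, readonly=True, lock=False) there instead. The class names are made up for illustration.

```python
import pickle


class FakeEnv:
    """Hypothetical stand-in for lmdb.Environment: it refuses to be pickled."""

    def __init__(self, path):
        self.path = path

    def __reduce__(self):
        raise TypeError("can't pickle Environment objects")


def open_env(path):
    # Assumption: with the real library this would be something like
    # lmdb.open(path, readonly=True, lock=False).
    return FakeEnv(path)


class NaiveDataset:
    """Holds the environment directly, so pickling the dataset fails."""

    def __init__(self, path):
        self.path = path
        self.env = open_env(path)


class PicklableDataset:
    """Drops the environment when pickled and reopens it when unpickled."""

    def __init__(self, path):
        self.path = path
        self.env = open_env(path)

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["env"]  # never ship the handle to a worker process
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.env = open_env(self.path)  # each worker opens its own handle


def survives_pickle(obj):
    """Round-trip obj through pickle; return the copy, or None on TypeError."""
    try:
        return pickle.loads(pickle.dumps(obj))
    except TypeError:
        return None
```

With spawn-based workers the DataLoader has to pickle the dataset for every worker, which is the step that blows up in the traceback above. survives_pickle(NaiveDataset(...)) returns None here, while the PicklableDataset copy comes back with a freshly opened environment of its own.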
I am seeing this when running it on Windows 10, it is solved when I set num_workers=0 for the DataLoader()
you saved me, man!! thanks.
I found some GitHub repos that use both LMDB and num_workers and still work successfully, but I don't know why. You can find an example here: stylegan2 dataset
@jgoodson @neillbyrne @ruotianluo @airsplay the solution