
TypeError: can't pickle Environment objects

Open dbrivio opened this issue 6 years ago • 11 comments

Hello,

I'm trying to run the dcgan/main.py file to train a GAN. I'm using a Windows 7 system with Python 3.7 (Anaconda).

I run the following line:

    %run main.py --dataset lsun --dataroot bedroom_train_lmdb/ --niter 1

and I got the following output:

    Namespace(batchSize=64, beta1=0.5, cuda=False, dataroot='bedroom_train_lmdb/', dataset='lsun', imageSize=64, lr=0.0002, manualSeed=None, ndf=64, netD='', netG='', ngf=64, ngpu=1, niter=1, nz=100, outf='.', workers=2)
    Random Seed: 482
    Generator(
      (main): Sequential(
        (0): ConvTranspose2d(100, 512, kernel_size=(4, 4), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace)
        (3): ConvTranspose2d(512, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): ReLU(inplace)
        (6): ConvTranspose2d(256, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (7): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (8): ReLU(inplace)
        (9): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (10): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (11): ReLU(inplace)
        (12): ConvTranspose2d(64, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (13): Tanh()
      )
    )
    Discriminator(
      (main): Sequential(
        (0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (1): LeakyReLU(negative_slope=0.2, inplace)
        (2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): LeakyReLU(negative_slope=0.2, inplace)
        (5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (7): LeakyReLU(negative_slope=0.2, inplace)
        (8): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
        (9): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (10): LeakyReLU(negative_slope=0.2, inplace)
        (11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), bias=False)
        (12): Sigmoid()
      )
    )
    Traceback (most recent call last):
      File "Y:\Research\Davide\ML\GAN\lsun-master\main.py", line 210, in <module>
        for i, data in enumerate(dataloader, 0):
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
        return _DataLoaderIter(self)
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
        w.start()
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\process.py", line 112, in start
        self._popen = self._Popen(self)
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
        return _default_context.get_context().Process._Popen(process_obj)
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
        return Popen(process_obj)
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
        reduction.dump(process_obj, to_child)
      File "C:\Users\db396\AppData\Local\Continuum\anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    TypeError: can't pickle Environment objects

It must be something related to Windows. Any suggestions on how to solve this issue? Thanks

dbrivio avatar Mar 15 '19 19:03 dbrivio

Add the following lines at the end of the imports section, right after import torchvision.utils as vutils:

if __name__ == '__main__':
    torch.multiprocessing.set_start_method('spawn')

soumith avatar Mar 29 '19 04:03 soumith

Ideally, the script needs to be refactored to push everything into a main() function (that's the root of the original problem).
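For illustration, a minimal sketch of that refactor (with a hypothetical placeholder dataset; in the real script this would be the LSUN/LMDB dataset, the models, and the full training loop). Moving the top-level code into main() lets the worker processes spawned by the DataLoader on Windows re-import the module without re-running the training code:

    import torch
    import torch.utils.data

    def main():
        # placeholder dataset for the sketch; the real script builds the LSUN dataset here
        dataset = torch.utils.data.TensorDataset(torch.zeros(8, 3, 64, 64))
        dataloader = torch.utils.data.DataLoader(dataset, batch_size=4,
                                                 shuffle=True, num_workers=2)
        for epoch in range(1):
            for i, (data,) in enumerate(dataloader, 0):
                pass  # training step goes here

    if __name__ == '__main__':
        # on Windows, worker processes are spawned and re-import this module,
        # so the training code must sit behind this guard
        main()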

soumith avatar Mar 29 '19 04:03 soumith

I have the same error, with a different file but essentially the same setup.

dave7895 avatar Mar 16 '20 11:03 dave7895

I am seeing this when running it on Windows 10, it is solved when I set num_workers=0 for the DataLoader()
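For reference, a minimal sketch of that change (the exact DataLoader arguments in the example script may differ). With num_workers=0 the data is loaded in the main process, so nothing has to be pickled:

    dataloader = torch.utils.data.DataLoader(dataset, batch_size=opt.batchSize,
                                             shuffle=True, num_workers=0)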

yptheangel avatar Mar 28 '20 13:03 yptheangel

@soumith Hello, I have the same issue. I have tried setting the multiprocessing start method to spawn, but it makes no difference and the error still occurs.

Could you please tell me another way to solve it?

theodoruszq avatar Jun 15 '20 10:06 theodoruszq

I am seeing this when running it on Windows 10, it is solved when I set num_workers=0 for the DataLoader()

Perfect solution, but what is the specific reason?

rsqai avatar Aug 06 '20 02:08 rsqai

@soumith Can you elaborate on the issue here? The common factor between my code and this code is LMDB, and it produces the exact same error. Does this have something to do with trouble pickling the lmdb instance?

jerinphilip avatar Sep 13 '20 17:09 jerinphilip

The issue is that you cannot pickle LMDB Environment objects. Setting num_workers=0 avoids pickling anything, since the main process, which already holds the original object, retrieves the data itself.

The real solution is to store the LMDB Environment in a class with custom __getstate__() and __setstate__() methods that drop the Environment from the pickled state dictionary and then regenerate it when the object is unpickled.
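A minimal sketch of that idea, assuming the lmdb package and a hypothetical LMDBDataset wrapper (not the actual code from the example): the Environment is removed from the pickled state and reopened inside each worker process.

    import lmdb
    import torch.utils.data

    class LMDBDataset(torch.utils.data.Dataset):
        # hypothetical LMDB-backed dataset that survives pickling for DataLoader workers

        def __init__(self, path):
            self.path = path
            self.env = lmdb.open(path, readonly=True, lock=False)
            with self.env.begin() as txn:
                self.length = txn.stat()['entries']

        def __getstate__(self):
            # drop the un-picklable lmdb.Environment before the object is sent to a worker
            state = self.__dict__.copy()
            state['env'] = None
            return state

        def __setstate__(self, state):
            # regenerate the Environment inside the worker process
            self.__dict__.update(state)
            self.env = lmdb.open(self.path, readonly=True, lock=False)

        def __len__(self):
            return self.length

        def __getitem__(self, index):
            with self.env.begin() as txn:
                return txn.get(str(index).encode())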

jgoodson avatar Oct 26 '20 15:10 jgoodson

I am seeing this when running it on Windows 10, it is solved when I set num_workers=0 for the DataLoader()

You saved me, man!! Thanks.

pritamqu avatar Dec 17 '20 22:12 pritamqu

I found some GitHub repos that use both LMDB and multiple workers and still work successfully, but I don't know why. You can find the examples here: stylegan2 dataset

clipBert dataset

@jgoodson @neillbyrne @ruotianluo @airsplay the solution

shoutOutYangJie avatar Nov 26 '21 13:11 shoutOutYangJie