pytorch-meta-dataset
Exception when num_workers > 0 on Windows, works on Linux
On Windows 10, if num_workers > 0, you get the following exception:
Traceback (most recent call last):
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle generator objects
python-BaseException
Traceback (most recent call last):
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
python-BaseException
I believe this is due to the way my datasets are instantiated. For instance, when an EpisodicDataset is instantiated, it creates a list of generators at https://github.com/mboudiaf/pytorch-meta-dataset/blob/5c4e85b149cf7079789190a6326c73bcc7efd1f6/pytorch_meta_dataset/pipeline.py#L100 . The problem is that generator objects cannot be pickled, which is exactly what multiprocessing seems to be doing on Windows when it is activated (i.e., num_workers > 0). I suspect the way it works is that the dataset is created in the main process, and then pickled for the worker processes to load.
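To illustrate the failure mode in isolation, here is a minimal sketch using only the standard library. EagerDataset, numbers and can_pickle are hypothetical names, not part of pytorch-meta-dataset; the only point is that an object holding a generator as an attribute cannot be pickled, which is what the traceback above shows multiprocessing attempting on Windows.

```python
import pickle


def numbers():
    """A plain generator function, standing in for a real data reader."""
    yield from range(3)


class EagerDataset:
    """Hypothetical dataset that builds its generator in __init__."""

    def __init__(self):
        # The generator object becomes part of the instance state,
        # so pickling the dataset means pickling the generator.
        self.generator = numbers()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


# Generators are not picklable, so neither is an object that holds one.
generator_ok = can_pickle(numbers())
eager_ok = can_pickle(EagerDataset())
```

Both generator_ok and eager_ok come out False here. On Linux the default start method has traditionally been fork, which does not pickle the dataset, which would explain why the same code works there.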
So the workaround would be to remove this line and find a way to create the generators in the __iter__ method (only when needed, of course) rather than in __init__. This should be doable with a try/except. Given that I do not have Windows 10, I will unfortunately be unable to reproduce this error, but I would be happy to help debug it further :)
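A rough sketch of what that workaround could look like, assuming the dataset only needs picklable configuration at construction time. LazyDataset, count_up_to and sizes are hypothetical stand-ins rather than the actual pytorch-meta-dataset classes, and the sketch uses a None check where a try/except would work equally well.

```python
import pickle


def count_up_to(n):
    """Generator function standing in for a real per-class reader."""
    for i in range(n):
        yield i


class LazyDataset:
    """Hypothetical dataset that defers generator creation to __iter__."""

    def __init__(self, sizes):
        self.sizes = list(sizes)   # plain, picklable configuration
        self.generators = None     # built on first iteration, not here

    def __iter__(self):
        if self.generators is None:
            # Created inside the process that actually iterates.
            self.generators = [count_up_to(n) for n in self.sizes]
        for generator in self.generators:
            yield from generator

    def __getstate__(self):
        # Drop any live generators so the object stays picklable
        # even after it has been iterated in the parent process.
        state = self.__dict__.copy()
        state["generators"] = None
        return state


if __name__ == "__main__":
    dataset = LazyDataset([2, 3])
    clone = pickle.loads(pickle.dumps(dataset))  # what a spawned worker receives
    print(list(clone))
```

The __getstate__ method is optional if the dataset is never iterated before the workers start, but it keeps the object safe to pickle either way.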
Thanks for clarifying. I have a linux machine as well, so I am not blocked. I may try your suggestion.
I have tried to fix the issue by implementing the workaround I proposed earlier. Please let me know if that fixes the issue on Windows! Thanks in advance :)
Thanks for working on this. Unfortunately, there is still an issue on Windows. With num_workers > 0, there is a new error:
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'Reader.construct_class_datasets.
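This kind of error is the usual symptom of pickling a function or class that was defined inside another function, since pickle stores functions by their importable name. The message above is cut off, so which local object inside Reader.construct_class_datasets is involved is not visible here. The sketch below is therefore only a generic illustration with hypothetical names (make_local_transform, scale_value, make_picklable_transform), showing one common way around it: move the function to module level and bind its arguments with functools.partial.

```python
import functools
import pickle


def make_local_transform(scale):
    """Returns a nested function; pickle cannot look it up by name."""
    def transform(x):
        return x * scale
    return transform


def scale_value(x, scale):
    """Module-level equivalent, reachable by its qualified name."""
    return x * scale


def make_picklable_transform(scale):
    """Bind the argument with functools.partial instead of a closure."""
    return functools.partial(scale_value, scale=scale)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


# The nested function fails to pickle, as in the error above.
local_ok = can_pickle(make_local_transform(2))

if __name__ == "__main__":
    # The partial pickles because it only references scale_value by name.
    print(local_ok, can_pickle(make_picklable_transform(2)))
```

Both transforms compute the same result, so swapping one for the other should not change what the dataset returns, only whether it can be sent to a worker process.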