signaltrain
Have you figured out why PyTorch 1.2.0 raises `cuda error re-init` even when multiprocessing is not used?
See https://discuss.pytorch.org/t/not-using-multiprocessing-but-getting-cuda-error-re-forked-subprocess/54610
Thanks!
@sonack I'm so sorry for not responding. I seem to have a problem with my GitHub notifications.
I did not solve this issue. Instead, I gave up on generating data 'on the fly' and started generating a "files" dataset before training.
Apart from solving the issues with the multiprocessing workers and the conflict with pytorch, it was just more efficient in the long run, because then I can save time if I want to train on the same data multiple times.
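For anyone who wants to try the same workaround, here is a minimal sketch of the "generate files first, then train from disk" idea. It is not the actual signaltrain code: `generate_example`, `write_dataset`, `FileDataset`, and the file naming are all hypothetical, and the generator is a trivial placeholder. It uses only the standard library; in practice you would probably save arrays or tensors instead of pickled lists.

```python
import os
import pickle
import random
import tempfile


def generate_example(rng, n_samples=16):
    """Hypothetical stand-in for the expensive on-the-fly generator."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    y = [0.5 * v for v in x]  # placeholder "target" signal
    return x, y


def write_dataset(out_dir, n_examples, seed=0):
    """Generate every example once and save each one to its own file."""
    os.makedirs(out_dir, exist_ok=True)
    rng = random.Random(seed)
    for i in range(n_examples):
        path = os.path.join(out_dir, f"example_{i:06d}.pkl")
        with open(path, "wb") as f:
            pickle.dump(generate_example(rng), f)


class FileDataset:
    """Map-style dataset: reads one pre-generated example per index."""

    def __init__(self, data_dir):
        self.paths = sorted(
            os.path.join(data_dir, name)
            for name in os.listdir(data_dir)
            if name.endswith(".pkl")
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        with open(self.paths[idx], "rb") as f:
            return pickle.load(f)


# One-time generation step, run before training (temporary dir for the demo).
data_dir = os.path.join(tempfile.mkdtemp(), "train")
write_dataset(data_dir, n_examples=8)
dataset = FileDataset(data_dir)
```

With this layout, `__getitem__` only reads a file, so the data-loading side does no generation work, and the same files can be reused across training runs. This sketch does not explain the root cause of the re-init error; it only sidesteps on-the-fly generation.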