
Have you figured out why PyTorch 1.2.0 raises `cuda error re-init` even without using multiprocessing?

Open: sonack opened this issue 4 years ago • 1 comment

I.e., the problem described here: https://discuss.pytorch.org/t/not-using-multiprocessing-but-getting-cuda-error-re-forked-subprocess/54610

Thanks!

sonack, Nov 16 '19 11:11

@sonack I'm so sorry for not responding; I seem to have a problem with my GitHub notifications. I did not solve this issue. Instead, I gave up on generating data 'on the fly' and started generating a "files" dataset before training.
Apart from sidestepping the conflict between the multiprocessing workers and PyTorch, this was also more efficient in the long run, because I save time whenever I want to train on the same data multiple times.

drscotthawley, Jun 20 '20 18:06
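
For readers who land here with the same error, the workaround described above (write the data to files once, then read them back during training) might look roughly like the sketch below. This is not the actual signaltrain code: `generate_example`, `write_dataset`, `FileDataset`, the pickle file layout, and the "halve the amplitude" target are all hypothetical placeholders. The sketch uses only the standard library; the class implements `__len__` and `__getitem__`, which is the interface PyTorch expects from a map-style dataset, so it should be possible to hand it to `torch.utils.data.DataLoader`. Nothing in the dataset touches CUDA, which is the point: worker processes only read files.

```python
import os
import pickle
import random
import tempfile


def generate_example(rng, n_samples=64):
    """Hypothetical stand-in for an expensive on-the-fly generator."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    y = [0.5 * v for v in x]  # placeholder target: input at half amplitude
    return x, y


def write_dataset(out_dir, n_examples, seed=0):
    """Generate every example once and store each one in its own file."""
    os.makedirs(out_dir, exist_ok=True)
    rng = random.Random(seed)
    for i in range(n_examples):
        path = os.path.join(out_dir, f"example_{i:05d}.pkl")
        with open(path, "wb") as f:
            pickle.dump(generate_example(rng), f)


class FileDataset:
    """Map-style dataset: only __len__ and __getitem__, no CUDA calls."""

    def __init__(self, root):
        self.paths = sorted(
            os.path.join(root, name)
            for name in os.listdir(root)
            if name.endswith(".pkl")
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Plain file I/O, so this is safe to run in a worker process.
        with open(self.paths[idx], "rb") as f:
            return pickle.load(f)


if __name__ == "__main__":
    data_dir = os.path.join(tempfile.mkdtemp(), "train")
    write_dataset(data_dir, n_examples=8)
    dataset = FileDataset(data_dir)
    x, y = dataset[0]
    print(len(dataset), len(x), len(y))
```

In a real training script you would presumably convert each example to a tensor and move batches to the GPU in the main training loop rather than inside `__getitem__`. Because the files persist on disk, later runs can skip `write_dataset` entirely, which is where the time saving mentioned above comes from.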