[BUG] CGNN (Causal Graph Generation) + Usage of multiprocessing with pytorch

Open AmanTiwari1503 opened this issue 3 years ago • 1 comments

Describe the bug
Using the multiprocessing library together with torch across multiple processes leads to the following error:

p = cdt.causality.graph.CGNN().predict(df)
An exhaustive search of the causal structure of CGNN without skeleton is super-exponential in the number of variables.
A total of 3 graphs will be evaluated.
Process Process-3:
Process Process-2:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/path/to/python/installation/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/path/to/python/installation/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/path/to/python/installation/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/path/to/python/installation/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/path/to/python/installation/site-packages/cdt/utils/parallel.py", line 54, in worker_subprocess
    output = function(*args, **kwargs, device=device, idx=idx)
  File "/path/to/python/installation/site-packages/cdt/utils/parallel.py", line 54, in worker_subprocess
    output = function(*args, **kwargs, device=device, idx=idx)
  File "/path/to/python/installation/site-packages/cdt/causality/graph/CGNN.py", line 197, in graph_evaluation
    obs = th.Tensor(scale(data.values)).to(device)
  File "/path/to/python/installation/site-packages/cdt/causality/graph/CGNN.py", line 197, in graph_evaluation
    obs = th.Tensor(scale(data.values)).to(device)
  File "/path/to/python/installation/site-packages/torch/cuda/__init__.py", line 208, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. To use CUDA with "
  File "/path/to/python/installation/site-packages/torch/cuda/__init__.py", line 208, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. To use CUDA with "
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

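This is possibly caused by the parallel_run function in cdt/utils/parallel.py: the traceback shows the workers failing in worker_subprocess in that file, in processes that were started with fork.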

Please mention
- cdt version: 0.5.23
- Python version: 3.7.9
- PyTorch package version: 1.12.0+cu102
- GPU used: Tesla V100 PCIe 16GB

Possible fixes
Please refer to this Stack Overflow solution, which covers the 'spawn' start method that the error message asks for: https://stackoverflow.com/questions/48822463/how-to-use-pytorch-multiprocessing
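
For illustration, here is a minimal sketch of what starting workers with the 'spawn' method can look like. It uses only the standard library multiprocessing module (torch.multiprocessing wraps the same API). This is not cdt's actual code: evaluate_graph, _worker and run_jobs are hypothetical names, and evaluate_graph is a stand-in for the per-graph job that moves data to the device.

```python
import multiprocessing as mp


def evaluate_graph(idx, device):
    """Hypothetical stand-in for a per-graph evaluation job.

    In the real case this is where a tensor would be moved to `device`,
    the step that fails when the child process was created with fork.
    """
    return idx, "evaluated on {}".format(device)


def _worker(queue, idx, device):
    # Runs in the child process; sends the result back to the parent.
    queue.put(evaluate_graph(idx, device))


def run_jobs(n_jobs, device="cpu", start_method="spawn"):
    """Run n_jobs workers using an explicit start method.

    get_context() returns a context object bound to the chosen start
    method without changing the process-wide default.
    """
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    procs = [
        ctx.Process(target=_worker, args=(queue, i, device))
        for i in range(n_jobs)
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so children can exit cleanly.
    results = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return sorted(results)
```

With 'spawn', each child starts a fresh interpreter and re-imports the main module, so run_jobs(3) should be called from a script under an `if __name__ == "__main__":` guard, and the worker function has to be defined at module level so it can be pickled.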

Jul 28 '22 03:07 AmanTiwari1503

Hello! You're right. I thought I had solved this issue, though... I'll check.

Jul 29 '22 13:07 diviyank