TextSnake.pytorch

RuntimeError: Cannot re-initialize CUDA in forked subprocess

Open dadadadashan opened this issue 4 years ago • 3 comments

Epoch: 0 : LR = 0.0001
Traceback (most recent call last):
  File "train_textsnake.py", line 238, in <module>
    main()
  File "train_textsnake.py", line 223, in main
    train(model, train_loader, criterion, scheduler, optimizer, epoch, logger)
  File "train_textsnake.py", line 63, in train
    for i, (img, train_mask, tr_mask, tcl_mask, radius_map, sin_map, cos_map, meta) in enumerate(train_loader):
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 79, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 79, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 64, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 64, in <listcomp>
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/cuda/__init__.py", line 148, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Does anyone know how to solve it? Thanks.

dadadadashan, Nov 02 '20 14:11

Did you solve the issue? I am facing the same problem on Colab.

bhumikasinghrk, Dec 19 '20 15:12

I am also facing the same issue on Colab. Did you find a solution for this?

AmrutaAnalytics, Dec 24 '20 06:12

@princewang1994 @bhumikasinghrk Did you find a solution for this problem?

AmrutaAnalytics, Dec 24 '20 11:12
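
A possible direction, offered as a sketch and not verified against this repository: the error text in the traceback says that CUDA inside worker processes requires the 'spawn' start method, and torch.utils.data.DataLoader accepts a multiprocessing_context argument that selects it. The helper name loader_kwargs and the flag use_cuda_in_workers below are hypothetical, made up for illustration. The traceback shows CUDA being initialized from torch.as_tensor inside default_collate, which may mean tensors default to the GPU in this setup; that reading is an assumption.

```python
import multiprocessing as mp


def loader_kwargs(num_workers, use_cuda_in_workers):
    """Build extra keyword arguments for torch.utils.data.DataLoader.

    Hypothetical helper: it only assembles a dict, so it runs without torch.
    """
    kwargs = {"num_workers": num_workers}
    # With num_workers == 0 the data is loaded in the main process,
    # so no subprocess is forked and no start method is needed.
    if num_workers > 0 and use_cuda_in_workers:
        # Ask for 'spawn' instead of the default 'fork' on Linux, as the
        # error message suggests.
        kwargs["multiprocessing_context"] = mp.get_context("spawn")
    return kwargs


if __name__ == "__main__":
    # Intended use (assumes torch is installed and `dataset` is picklable):
    #   from torch.utils.data import DataLoader
    #   train_loader = DataLoader(dataset, batch_size=4, shuffle=True,
    #                             **loader_kwargs(4, True))
    print(loader_kwargs(4, True))
    print(loader_kwargs(0, True))
```

Two simpler things to try first: set num_workers=0 so that no worker process is created, or keep everything the dataset returns on the CPU and move each batch to the GPU inside the training loop. With 'spawn', the dataset object must be picklable and the script entry point should sit under an `if __name__ == "__main__":` guard.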