mmskeleton
CUDA RUNTIME ERROR when build_dataset_example
Hi,
I tried to build the example dataset using this command:
mmskl configs/utils/build_dataset_example.yaml --gpus 1
But I got this CUDA runtime error:
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1565272271120/work/aten/src/THC/THCGeneral.cpp line=54 error=3 : initialization error
Process Process-4:
Traceback (most recent call last):
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/manhh/github/mmskeleton/mmskeleton/processor/skeleton_dataset.py", line 21, in worker
detection_cfg, estimation_cfg, device=gpu)
File "/home/manhh/github/mmskeleton/mmskeleton/apis/estimation.py", line 30, in init_pose_estimator
detection_model = detection_model.cuda()
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 311, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 208, in _apply
module._apply(fn)
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 208, in _apply
module._apply(fn)
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 230, in _apply
param_applied = fn(param)
File "/home/manhh/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 311, in
Could you please help? I guess it comes from the call to Process:
for i in range(num_worker):
    p = Process(
        target=worker,
        args=(inputs, results, i % gpus, detection_cfg, estimation_cfg))
    procs.append(p)
    p.start()
But I am not sure.
The other commands work fine for me.
Thank you,
I faced the same issue. First I edited the mmskl.py file:
if __name__ == "__main__":
    torch.multiprocessing.set_start_method('spawn')
    main()
Then I ran the command below:
mmskl configs/utils/build_dataset_example.yaml
I first tried the above command with --gpus 0, but that didn't work for me.
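For anyone who would rather not change the global start method, here is a minimal sketch of the same idea applied to the worker loop quoted in the question. It only assumes that the error comes from forked children inheriting CUDA state from the parent, which is a common cause of "initialization error" but is not confirmed in this thread. The names `assign_device`, `worker` and `launch_workers` are hypothetical stand-ins, not mmskeleton code; the real worker would build the detection and pose models on its assigned GPU.

```python
import multiprocessing as mp


def assign_device(worker_index, gpus):
    """Round-robin GPU id for a worker, like the `i % gpus` in the quoted loop."""
    return worker_index % gpus


def worker(results, worker_index, device):
    # Hypothetical stand-in: a real worker would initialise its CUDA models
    # here, inside the child process, and never in the parent.
    results.put((worker_index, device))


def launch_workers(num_worker, gpus):
    # A spawn context starts every child in a fresh interpreter, so nothing
    # CUDA-related is inherited from the parent (unlike fork, the Linux
    # default for the Python 3.7 shown in the traceback).
    ctx = mp.get_context("spawn")
    results = ctx.Queue()
    procs = []
    for i in range(num_worker):
        p = ctx.Process(target=worker,
                        args=(results, i, assign_device(i, gpus)))
        procs.append(p)
        p.start()
    # Drain the queue before joining so that no child blocks on a full pipe.
    collected = [results.get() for _ in procs]
    for p in procs:
        p.join()
    return sorted(collected)
```

Because spawn re-imports the main module in every child, `launch_workers(...)` should only be called from under an `if __name__ == "__main__":` guard in the entry script.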
I encountered the same problem, is there a solution?
I had the same problem. You just need to use multiprocessing instead of torch.multiprocessing. Here is my solution, edited in the mmskl.py file:
import multiprocessing as mp

if __name__ == "__main__":
    mp.set_start_method('spawn')
    main()
It works for me!
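One thing to watch for with this fix: `set_start_method` raises `RuntimeError` if a start method has already been fixed earlier in the same process. Below is a small sketch of a tolerant variant, in case that happens in your setup. `ensure_start_method` is a hypothetical helper name, not part of mmskeleton or of the multiprocessing module.

```python
import multiprocessing as mp


def ensure_start_method(method="spawn"):
    """Try to select `method`; return the start method actually in effect."""
    try:
        mp.set_start_method(method)
    except RuntimeError:
        # A start method was already fixed earlier in this process;
        # keep it rather than crashing.
        pass
    return mp.get_start_method()
```

In the entry script this would be called under the `if __name__ == "__main__":` guard, before `main()`. If the returned value is not 'spawn', some earlier code chose the start method first, and that is the place to change it.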
How did you solve this problem guys?