mmcv Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method


Checklist

I have searched related issues but cannot get the expected help: yes
I have read the FAQ documentation but cannot get the expected help: yes
The unexpected results still exist in the latest version: no

Describe the Issue

When I import mmcv and use Python multiprocessing, I get the error below. I do not understand why merely importing mmcv, without calling anything from it, triggers the error: the same code runs normally when mmcv is not imported. I know that adding torch.multiprocessing.set_start_method("spawn") makes it work, but I want to know what changes in the environment when mmcv is imported.

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

This is my code

import multiprocessing as mp
import numpy as np
import torch
import sys

import traceback

from loguru import logger
import os

import mmcv

class TestProcess(mp.Process):
    def __init__(self, job_queue) -> None:
        super(TestProcess, self).__init__()
        self.job_queue = job_queue

    def run(self):

        while True:
            try:
                task = self.job_queue.get(timeout=5)
                img = torch.from_numpy(task).unsqueeze(0)
                img = img.cuda()
                logger.error(f"b6-{str(os.getpid())}-{str(os.getppid())}")

            except Exception:  # includes queue.Empty on timeout; print it and stop
                traceback.print_exc()
                break


if __name__ == "__main__":
    # torch.multiprocessing.set_start_method("spawn")

    manager = mp.Manager()
    job_queue = manager.Queue(1000)
    logger.error(f"0-{str(os.getpid())}-{str(os.getppid())}")

    task_list = []

    for index in range(1):
        one = TestProcess(job_queue)
        task_list.append(one)

    for one_task in task_list:
        one_task.start()

    for index in range(10):
        one_data = np.random.random((100, 100, 3))
        job_queue.put(one_data)

    # Wait for every worker, not only the last one created.
    for one_task in task_list:
        one_task.join()
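For reference, the 'spawn' workaround mentioned above can also be applied per process group instead of globally. The following is only a minimal sketch under my own assumptions: the names worker and run_jobs are made up for illustration, the tasks are plain lists of numbers, and torch is treated as optional so the pattern can be tried on a machine without a GPU. It uses multiprocessing.get_context("spawn"), so each child starts from a fresh interpreter and does not inherit whatever CUDA state the parent has.

```python
import multiprocessing as mp
import queue


def worker(job_queue, result_queue):
    """Consume tasks until a None sentinel arrives or the queue stays empty."""
    try:
        import torch  # imported in the child; optional for this sketch
        device = "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        torch, device = None, "cpu"

    while True:
        try:
            task = job_queue.get(timeout=5)
        except queue.Empty:
            break
        if task is None:  # sentinel: no more work
            break
        if torch is not None:
            # With 'spawn' the child may initialize CUDA here on its own.
            total = torch.tensor(task, device=device).sum().item()
        else:
            total = sum(task)
        result_queue.put(total)


def run_jobs(tasks, start_method="spawn"):
    """Start one worker with the given start method and collect its results."""
    ctx = mp.get_context(start_method)  # does not change the global default
    job_queue = ctx.Queue()
    result_queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(job_queue, result_queue))
    proc.start()
    for task in tasks:
        job_queue.put(task)
    job_queue.put(None)
    results = [result_queue.get(timeout=60) for _ in tasks]
    proc.join(timeout=10)
    return results


if __name__ == "__main__":
    print(run_jobs([[1, 2, 3], [4, 5, 6]]))
```

With 'spawn' the target function and its arguments are pickled and sent to the child, so they have to be defined at module level and the launching code has to stay under the `if __name__ == "__main__":` guard. This only avoids the error; it does not explain what importing mmcv changes.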

Environment

sys.platform: linux
Python: 3.7.7 (default, Jan 22 2022, 21:27:43) [GCC 9.3.0]
CUDA available: True
GPU 0: NVIDIA GeForce GTX 1060
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29745058_0
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.9.0+cu102
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON
TorchVision: 0.10.0+cu102
OpenCV: 4.5.5
MMCV: 1.4.7
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: 11.3

Error traceback

2022-05-20 19:15:47.236 | ERROR    | __main__:<module>:37 - 0-20563-7433
Traceback (most recent call last):
  File "test_issure.py", line 24, in run
    img = img.cuda()
  File "/home/nicken/.pyenv/versions/pytorch/lib/python3.7/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. To use CUDA with "
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
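The traceback shows the check firing inside torch's _lazy_init in the forked child, which suggests that something touched CUDA in the parent before the fork. That is an assumption, not something confirmed in this thread. One way to narrow down which import is responsible is to run each candidate import in a fresh interpreter, fork, and see whether the child can still use CUDA. The sketch below does that; fork_survives and its template are hypothetical helper names, it relies on os.fork and so only works on Unix-like systems, and the default probe assumes torch with a working GPU.

```python
import subprocess
import sys
import textwrap

# Script run in a fresh interpreter: execute `setup`, fork, run `probe` in the child.
_TEMPLATE = textwrap.dedent(
    """\
    import os, sys
    {setup}
    pid = os.fork()
    if pid == 0:
        try:
            {probe}
        except Exception as exc:
            print("child failed:", exc, file=sys.stderr)
            os._exit(1)
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    sys.exit(0 if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0 else 1)
    """
)


def fork_survives(setup, probe="import torch; torch.zeros(1).cuda()"):
    """Return True if `probe` succeeds in a forked child after `setup` ran in the parent.

    `setup` may span several unindented lines; `probe` must be a single line.
    """
    script = _TEMPLATE.format(setup=setup, probe=probe)
    return subprocess.run([sys.executable, "-c", script]).returncode == 0
```

Comparing fork_survives("import torch") with fork_survives("import torch\nimport mmcv") should reproduce the difference described in this issue, and replacing the mmcv import with individual submodules would then bisect toward the import that changes the state.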

May 20 '22 11:05 nicken

I tried the same code with the latest version and got the same error.

Environment

sys.platform: linux
Python: 3.7.7 (default, Jan 22 2022, 21:27:43) [GCC 9.3.0]
CUDA available: True
GPU 0: NVIDIA GeForce GTX 1060
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.11.0+cu113
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF
TorchVision: 0.12.0+cu113
OpenCV: 4.5.5
MMCV: 1.5.1
MMCV Compiler: GCC 9.4
MMCV CUDA Compiler: 11.3

May 20 '22 11:05 nicken

Hi @nicken, thanks for your report. We will try to reproduce the error.

May 21 '22 14:05 zhouzaida

I have reproduced the error. This error is a bit strange.

May 26 '22 14:05 zhouzaida

Did you see anything that might be a problem?

May 30 '22 11:05 nicken

I have a similar problem, have you solved it?

Aug 08 '22 09:08 tanghy2016
