
CUDA "driver shutting down" Error when built with shared libraries

Status: Open. MimetrikDE opened this issue 2 years ago • 1 comment

Describe the issue

When Open3D is linked as a dynamic library, creating an open3d::core::Tensor object with the default constructor and then assigning it a new instance with a new data allocation results in the following error when the application closes:

[Open3D Error] (void __cdecl open3d::core::__OPEN3D_CUDA_CHECK(enum cudaError,const char *,const int)) C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:289: C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:114 CUDA runtime error: driver shutting down

This causes the application to hang for several seconds, likely due to a crash on exit.

This issue does not occur when the same code is run with Open3D built and linked as a static library.

Repository with minimal CMake project to reproduce

Steps to reproduce the bug

1. Build Open3D with shared libraries as described below.
2. Create an open3d::core::Tensor object using the default constructor.
3. Pass the Tensor object to a function by reference.
4. Inside the function, assign a new Tensor instance with an allocation on the device ("CUDA:0") to the passed object.
5. Wait for the program to exit.
6. The application crashes before closing with the stated error message.

 
#include <iostream>  // std::cout, std::endl

#include <Open3D/Open3D.h>
#include <Open3D/core/CUDAUtils.h>

void AssignNew(open3d::core::Tensor& testTensor) {
	testTensor = open3d::core::Tensor::Zeros({ 100,3 }, open3d::core::Dtype::Float32, open3d::core::Device("CUDA:0"));
}


int main() {

	std::cout << "Start" << std::endl;

	open3d::core::Tensor testTensor;
	AssignNew(testTensor);

	open3d::core::cuda::ReleaseCache();

	std::cout << "Finished program" << std::endl;

	return 0;
}

Error message

Start
Finished program
[Open3D Error] (void __cdecl open3d::core::__OPEN3D_CUDA_CHECK(enum cudaError,const char *,const int)) C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:289: C:\Users\User\Repositories\Open3D\cpp\open3d\core\CUDAUtils.cpp:114 CUDA runtime error: driver shutting down
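In the output above, the error is printed after "Finished program", so it is raised after the last statement in main, once local objects and later static objects are being destroyed. The following is a toy model of one possible explanation, not a confirmed diagnosis: it assumes that a destroyed buffer returns its memory to a cache, and that the cache is only emptied later, after the runtime has shut down. ToyRuntime, ToyCache and ToyBuffer are hypothetical stand-ins invented for this sketch; none of them are Open3D or CUDA APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are illustrative stand-ins, not Open3D or CUDA APIs.

// Stands in for the GPU runtime; frees fail once it is no longer alive.
struct ToyRuntime {
    bool alive = true;
    std::vector<std::string> log;
    void Free() {
        log.push_back(alive ? "free ok" : "error: driver shutting down");
    }
};

// Stands in for a caching allocator that holds blocks until released.
struct ToyCache {
    ToyRuntime& runtime;
    int cached_blocks = 0;
    explicit ToyCache(ToyRuntime& rt) : runtime(rt) {}
    // Hand every cached block back to the runtime.
    void Release() {
        assert(cached_blocks >= 0);
        for (; cached_blocks > 0; --cached_blocks) runtime.Free();
    }
};

// A buffer that returns its block to the cache when it is destroyed.
struct ToyBuffer {
    ToyCache& cache;
    explicit ToyBuffer(ToyCache& c) : cache(c) {}
    ~ToyBuffer() { ++cache.cached_blocks; }
};

// Same order as the reported program: the cache is released while the
// buffer is still alive, so the block is cached afterwards and only freed
// once the runtime is gone.
std::vector<std::string> ReleaseBeforeBufferDies() {
    ToyRuntime rt;
    ToyCache cache(rt);
    {
        ToyBuffer buf(cache);
        cache.Release();  // nothing is cached yet
    }                     // buf returns its block to the cache here
    rt.alive = false;     // runtime teardown
    cache.Release();      // late cleanup, as a static destructor might do
    return rt.log;
}

// Alternative order: the buffer is destroyed first, then the cache is
// released while the runtime is still alive.
std::vector<std::string> ReleaseAfterBufferDies() {
    ToyRuntime rt;
    ToyCache cache(rt);
    {
        ToyBuffer buf(cache);
    }                     // buf returns its block to the cache here
    cache.Release();      // freed while the runtime is alive
    rt.alive = false;     // runtime teardown
    cache.Release();      // nothing left to free
    return rt.log;
}
```

If this model applies, the equivalent experiment in the reproduction would be to declare testTensor inside an inner scope in main so that it is destroyed before open3d::core::cuda::ReleaseCache() is called. Whether that avoids the error on Windows with shared libraries has not been tested here.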

Expected behavior

For the given example code and CMake project, I expect the application to replace the empty Tensor instance passed into the function with a new Tensor instance holding a device allocation, and then exit cleanly.

Open3D, Python and System information

## Open3D Build Flags
- BUILD_CUDA_MODULE : True
- BUILD_SHARED_LIBS : True
- BUILD_WEBRTC : False
- STATIC_WINDOWS_RUNTIME : False
- BUILD_PYTHON_MODULE : False

Built using 
- Visual Studio 17 2022
- CMake 3.25.1
- Windows 10 64-bit x86
- C++17
- Open3D Version: 0.17.0, Commit: 5b6ef4b04b1a4184f12a2c1181ad2b7d2fe45248

System information
- i9 12900H
- RTX 3080 (Laptop)
- 16GB DDR4 RAM

Additional information

None

MimetrikDE, Sep 29 '23 17:09

The issue does not occur for me on Ubuntu 22.04 with CUDA 12.3 and the latest Open3D main as of Apr. 11th 2024 (v0.18.0).

But I can reproduce the same error on Windows 11 with the following setup:

  • Visual Studio 2022, MSVC 19.29
  • Open3D latest main as of Apr. 11th 2024 (v0.18.0)
  • CUDA 12.0 (more recent versions cause build errors, see: https://github.com/isl-org/Open3D/issues/6743)

elias-Mimetrik, Apr 12 '24 10:04