M3DSSD

CUDA version

Open kdheejb7 opened this issue 3 years ago • 2 comments

Hello,

Can you please let me know which version of CUDA you used?

In the first case, I used torch==0.4.1 and CUDA 10.0. When I ran the command python3 scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base, I got the following error:

  File "scripts/train_rpn_3d.py", line 324, in <module>
    main(args)
  File "scripts/train_rpn_3d.py", line 140, in main
    rpn_net, optimizer = init_training_model(conf, paths.output)
  File "/workspace/lib/core.py", line 69, in init_training_model
    network = absolute_import(dst_path)
  File "/workspace/lib/util.py", line 98, in absolute_import
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/workspace/output/base/20210923_152833/M3d_inference_align.py", line 5, in <module>
    from model.pose_dla_dcn import DLASeg, DeformConv
  File "/workspace/model/pose_dla_dcn.py", line 17, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/workspace/model/DCNv2/dcn_v2.py", line 11, in <module>
    from .dcn_v2_func import DCNv2Function
  File "/workspace/model/DCNv2/dcn_v2_func.py", line 9, in <module>
    from ._ext import dcn_v2 as _backend
  File "/workspace/model/DCNv2/_ext/dcn_v2/__init__.py", line 3, in <module>
    from ._dcn_v2 import lib as _lib, ffi as _ffi
ImportError: /workspace/model/DCNv2/_ext/dcn_v2/_dcn_v2.so: undefined symbol: __cudaPopCallConfiguration

I know that this error is caused by a mismatch between the CUDA and torch versions, so I thought that changing the CUDA version from 10.0 to 9.2 would solve it.
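To make the suspected mismatch concrete, here is a minimal sketch that prints the CUDA version PyTorch was built with next to the one reported by the nvcc found on PATH. The helper names (torch_cuda_version, parse_nvcc_release, nvcc_cuda_version) are made up for illustration and are not part of this repository. It assumes that nvcc, if installed, is on PATH.

```python
import re
import shutil
import subprocess


def torch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda  # None for CPU-only builds


def parse_nvcc_release(text):
    """Extract 'X.Y' from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def nvcc_cuda_version():
    """Return the CUDA release reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("torch built with CUDA:", torch_cuda_version())
    print("nvcc on PATH reports: ", nvcc_cuda_version())
```

If the two values differ, extensions compiled with that nvcc (such as the DCNv2 module in the traceback above) may not load against the installed torch build.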

But I ran into another problem with CUDA 9.2.

In the second case, I used torch==0.4.1 and CUDA 9.2. When I ran the command python3 scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base, I got the following error:

Traceback (most recent call last):
  File "scripts/train_rpn_3d.py", line 23, in <module>
    from lib.imdb_util import *
  File "/workspace/lib/imdb_util.py", line 24, in <module>
    from lib.rpn_util import *
  File "/workspace/lib/rpn_util.py", line 16, in <module>
    from lib.nms.gpu_nms import gpu_nms
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory

This error occurs before execution reaches the line that caused the error in the first case, so I cannot get the training command to run at all. I tried both CUDA 9.2 and 10.0, and each one causes a different problem.
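The libcudart.so.10.0 message in the second case suggests that the compiled gpu_nms extension still records a dependency on the CUDA 10.0 runtime, probably because it was built while CUDA 10.0 was installed, so it would likely need to be rebuilt under 9.2. A minimal sketch for checking this, with hypothetical helper names and the path to the compiled .so file left for the reader to supply:

```python
import re
import subprocess


def missing_cuda_runtime(error_message):
    """Return the 'X.Y' version from a 'libcudart.so.X.Y' mention, or None."""
    match = re.search(r"libcudart\.so\.(\d+\.\d+)", error_message)
    return match.group(1) if match else None


def cudart_lines(ldd_output):
    """Return the lines of `ldd` output that mention libcudart."""
    return [line.strip() for line in ldd_output.splitlines() if "libcudart" in line]


def linked_cudart(so_path):
    """Run `ldd` on a shared object (Linux only) and report its libcudart entries."""
    result = subprocess.run(
        ["ldd", so_path], stdout=subprocess.PIPE, universal_newlines=True
    )
    return cudart_lines(result.stdout)


if __name__ == "__main__":
    msg = "libcudart.so.10.0: cannot open shared object file: No such file or directory"
    print(missing_cuda_runtime(msg))  # prints 10.0
```

Calling linked_cudart on the compiled extension and seeing an entry such as "libcudart.so.10.0 => not found" would confirm that the binary predates the switch to CUDA 9.2.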

Can you please let me know which version of CUDA you used?

Thank you!

kdheejb7, Sep 23 '21 15:09

Hi @kdheejb7, did you solve the version problem?

Wasiiiii, Oct 04 '21 09:10

Hello, have you solved the version problem? I have a similar issue. @kdheejb7

123456789live, Dec 09 '21 07:12