Running run_custom.py fails with "No CUDA GPUs are available"
I followed the described steps in order. When I ran run_custom.py, it first warned that version `GLIBCXX_3.4.30' was not found, which I resolved as described in #15 using the following command:
export LD_LIBRARY_PATH=/opt/conda/lib:${LD_LIBRARY_PATH}
That warning is now gone, but a new error appears:
RuntimeError: No CUDA GPUs are available
I also tried listing the available GPUs inside the Docker container, and it shows that 0 GPUs are available.
I am currently on an RTX 4090, and I am wondering whether this is because the default CUDA 11.3 version does not support it.
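For anyone reproducing this, the GPU check mentioned above can be made a bit more informative. The following is only a minimal sketch: the helper name `gpu_visibility_report` is made up for illustration, and the `/dev/nvidia*` check assumes a native Linux host. It reports whether the container exposes NVIDIA device nodes, whether `nvidia-smi` is on the PATH, the relevant environment variables, and what PyTorch sees if it can be imported.

```python
import glob
import os
import shutil


def gpu_visibility_report():
    """Collect basic facts about whether this process can see an NVIDIA GPU."""
    report = {
        # On native Linux, device nodes such as /dev/nvidia0 are normally
        # present when a GPU has been exposed to the container.
        "device_nodes": sorted(glob.glob("/dev/nvidia*")),
        # nvidia-smi is normally made available by the NVIDIA container runtime.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        # An empty CUDA_VISIBLE_DEVICES value hides every GPU from CUDA.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "NVIDIA_VISIBLE_DEVICES": os.environ.get("NVIDIA_VISIBLE_DEVICES"),
    }
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its shared libraries failed to load.
        report["torch"] = None
    else:
        report["torch"] = {
            "version": torch.__version__,
            "built_for_cuda": torch.version.cuda,
            "cuda_available": torch.cuda.is_available(),
            "device_count": torch.cuda.device_count(),
        }
    return report


if __name__ == "__main__":
    for key, value in gpu_visibility_report().items():
        print(f"{key}: {value}")
```

If `device_nodes` is empty and `nvidia_smi_path` is `None`, the container was most likely started without GPU access, which would point away from a CUDA version problem.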
Did you install nvidia-docker (the NVIDIA Container Toolkit)?
@wenbowen123 Hi, I have the NVIDIA Container Toolkit installed, yet I experience the same problem on a 3070 Ti:
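A common way to confirm that the toolkit works is to start a throwaway container with GPU access and run `nvidia-smi` inside it. The sketch below only builds and prints that command. The function name and the image name `your-image:tag` are placeholders, not anything from this repository; `--gpus all` is the standard Docker flag for exposing GPUs.

```python
import shlex


def gpu_smoke_test_command(image):
    """Build a docker command that runs nvidia-smi in a GPU-enabled container."""
    # --rm removes the container afterwards; --gpus all requests every GPU.
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


if __name__ == "__main__":
    # "your-image:tag" is a placeholder; substitute the image you actually use.
    print(shlex.join(gpu_smoke_test_command("your-image:tag")))
```

If the printed command fails on the host or shows no GPU, the problem is in the Docker / toolkit setup rather than in run_custom.py.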
(py38) root@host:/BundleSDF# python run_custom.py --mode run_video --video_dir ./data/2022-11-18-15-10-24_milk --out_folder ./data/2022-11-18-15-10-24_milk_out --use_segmenter 1 --use_gui 1 --debug_level 2
[2024-09-17 13:03:41.688] [warning] [Bundler.cpp:49] Connected to nerf_port 9999
[2024-09-17 13:03:41.689] [warning] [FeatureManager.cpp:2084] Connected to port 5555
default_cfg {'backbone_type': 'ResNetFPN', 'resolution': (8, 2), 'fine_window_size': 5, 'fine_concat_coarse_feat': True, 'resnetfpn': {'initial_dim': 128, 'block_dims': [128, 196, 256]}, 'coarse': {'d_model': 256, 'd_ffn': 256, 'nhead': 8, 'layer_names': ['self', 'cross', 'self', 'cross', 'self', 'cross', 'self', 'cross'], 'attention': 'linear', 'temp_bug_fix': False}, 'match_coarse': {'thr': 0.2, 'border_rm': 2, 'match_type': 'dual_softmax', 'dsmax_temperature': 0.1, 'skh_iters': 3, 'skh_init_bin_score': 1.0, 'skh_prefilter': True, 'train_coarse_percent': 0.4, 'train_pad_num_gt_min': 200}, 'fine': {'d_model': 128, 'd_ffn': 128, 'nhead': 8, 'layer_names': ['self', 'cross'], 'attention': 'linear'}}
Traceback (most recent call last):
  File "run_custom.py", line 223, in <module>
    run_one_video(video_dir=args.video_dir, out_folder=args.out_folder, use_segmenter=args.use_segmenter, use_gui=args.use_gui)
  File "run_custom.py", line 68, in run_one_video
    tracker = BundleSdf(cfg_track_dir=cfg_track_dir, cfg_nerf_dir=cfg_nerf_dir, start_nerf_keyframes=5, use_gui=use_gui)
  File "/BundleSDF/bundlesdf.py", line 318, in __init__
    self.loftr = LoftrRunner()
  File "/BundleSDF/loftr_wrapper.py", line 25, in __init__
    self.matcher = self.matcher.eval().cuda()
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 688, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
    module._apply(fn)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 578, in _apply
    module._apply(fn)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 601, in _apply
    param_applied = fn(param)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 688, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/cuda/__init__.py", line 216, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
My guess is that the GPU driver / CUDA version is incompatible with the CUDA 11.3 setup in the container:
Sat Sep 14 23:25:55 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3070 Ti Off | 00000000:01:00.0 On | N/A |
| N/A 42C P3 22W / 115W | 328MiB / 8192MiB | 21% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
I will try rebuilding the container from a more recent NVIDIA image.
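As a quick sanity check on the version guess: the "CUDA Version" field in the nvidia-smi header is the newest CUDA runtime the installed driver supports, and drivers generally remain compatible with older runtimes. The sketch below (helper names are made up for illustration) parses the header line pasted above and compares it against the container's 11.3. It says nothing about whether the container can actually see the GPU.

```python
import re


def parse_smi_header(line):
    """Extract the driver version and the maximum supported CUDA version."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*(\d+)\.(\d+)", line
    )
    if match is None:
        raise ValueError("not an nvidia-smi header line")
    return match.group(1), (int(match.group(2)), int(match.group(3)))


def driver_supports_runtime(max_cuda, runtime_cuda):
    """True if the driver's maximum CUDA version covers the given runtime."""
    # Tuples compare element by element, so (12, 2) >= (11, 3) is True.
    return max_cuda >= runtime_cuda


if __name__ == "__main__":
    header = "| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |"
    driver, max_cuda = parse_smi_header(header)
    print(driver, max_cuda, driver_supports_runtime(max_cuda, (11, 3)))
```

By this check alone, driver 535.183.01 would not rule out a CUDA 11.3 container, so it may be worth also confirming how the container was started before rebuilding.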