DALI
Cannot access CUDA GPU on WSL
Version
nvidia-dali-cuda120 1.37.1, nvidia-dali-nightly-cuda120 1.38.0.dev20240507
Describe the bug.
I've been following https://github.com/NVIDIA/DALI/issues/4663 and I'm seeing something similar, but I cannot figure out why. I can access my GPU on device 0 with nvidia-smi, and PyTorch can use it from the same conda environment, so I'm unclear why DALI cannot. This is inside a conda environment inside WSL on Windows.
Minimum reproducible example
Conda environment:

```yaml
name: multilabelimage_model_env
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pytorch
  - torchvision
  - torchaudio
  - pytorch-cuda=12.1
  - opencv
  - pandas
  - scikit-learn=1.4.0
  - wandb
  - matplotlib
  - tqdm
  - pillow
  - numpy
  - scipy
  - pyyaml
  - pip
  - pip:
    - torch-summary
    - tensorboard
    - torch-tb-profiler
    - torch-geometric
    - timm
```
I installed DALI using the official installation guide:

```shell
pip install --extra-index-url https://pypi.nvidia.com --upgrade nvidia-dali-cuda120
```

I also tried the nightly build.
Tested with a minimal example:

```python
import nvidia.dali as dali
import numpy as np

@dali.pipeline_def
def my_pipe():
    return dali.fn.external_source(np.array([1, 2, 3], dtype=np.float32), batch=False).gpu()

pipe = my_pipe(batch_size=1, num_threads=1, device_id=1)
pipe.build()
print(pipe.run())
```
Relevant log output
The minimal example above produces this error:

```
python dali_test.py
/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/backend.py:99: Warning: nvidia-dali-cuda120 is no longer shipped with CUDA runtime. You need to install it separately. cuFFT is typically provided with CUDA Toolkit installation or an appropriate wheel. Please check https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#pip-wheels-installation-linux for the reference.
  deprecation_warning(
/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/backend.py:110: Warning: nvidia-dali-cuda120 is no longer shipped with CUDA runtime. You need to install it separately. NPP is typically provided with CUDA Toolkit installation or an appropriate wheel. Please check https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#pip-wheels-installation-linux for the reference.
  deprecation_warning(
/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/backend.py:121: Warning: nvidia-dali-cuda120 is no longer shipped with CUDA runtime. You need to install it separately. nvJPEG is typically provided with CUDA Toolkit installation or an appropriate wheel. Please check https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html#pip-wheels-installation-linux for the reference.
  deprecation_warning(
Traceback (most recent call last):
  File "/mnt/c/Coding/Testing/PyTorch/MultiLabelClassification_Patreon/actual_real_user_code/dali_test.py", line 8, in <module>
    pipe.build()
  File "/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/pipeline.py", line 979, in build
    self._init_pipeline_backend()
  File "/root/miniconda3/envs/multilabelimage_model_env/lib/python3.11/site-packages/nvidia/dali/pipeline.py", line 813, in _init_pipeline_backend
    self._pipe = b.Pipeline(
                 ^^^^^^^^^^^
RuntimeError: CUDA runtime API error cudaErrorInvalidDevice (101):
invalid device ordinal
```
Other/Misc.
I found similar issues but could not find a solution.
Check for duplicates
- [X] I have searched the open bugs/issues and have found no duplicates for this bug report
Hello @benchd, please check your device id. You said you can access "device 0", but your DALI snippet specifies device 1:

```python
pipe = my_pipe(batch_size=1, num_threads=1, device_id=1)
                                            ^^^^^^^^^^^
```

CUDA device ordinals are zero-based, so on a single-GPU machine only `device_id=0` is valid; requesting ordinal 1 raises `cudaErrorInvalidDevice (101)`.
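As a plain-Python illustration (note: `check_device_id` is a hypothetical helper for this sketch, not a DALI API), this is the ordinal check that `pipe.build()` effectively fails on when only one GPU is visible:

```python
def check_device_id(device_id: int, device_count: int) -> None:
    """Raise early with a clear message instead of failing inside pipe.build().

    CUDA device ordinals are zero-based: with device_count GPUs,
    the valid ids are 0 .. device_count - 1.
    """
    if not 0 <= device_id < device_count:
        raise ValueError(
            f"invalid device ordinal {device_id}: "
            f"valid ids are 0..{device_count - 1}"
        )

check_device_id(0, 1)  # OK: single GPU, ordinal 0
try:
    check_device_id(1, 1)  # mirrors the failing pipeline above
except ValueError as e:
    print(e)  # prints: invalid device ordinal 1: valid ids are 0..0
```

With the snippet from the report, changing `device_id=1` to `device_id=0` should let the pipeline build on a single-GPU WSL setup.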