No CUDA device found; using CPU as fallback.
On first use, at the very first cell of the Sionna_Ray_Tracing_Introduction tutorial ([1]: GPU Configuration and Imports), the GPU was not found: No CUDA device found; using CPU as fallback.
But !nvidia-smi prints:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 550.90.07 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-PCIE-40GB Off | 00000000:01:00.0 Off | 0 |
| N/A 42C P0 37W / 250W | 425MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A100-PCIE-40GB Off | 00000000:81:00.0 Off | 0 |
| N/A 51C P0 47W / 250W | 1MiB / 40960MiB | 5% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
I am using a container built from the Docker image. Other RAPIDS Docker images work fine. Could this be a driver problem?
Hello @Fedomer,
Sionna uses Mitsuba for its ray tracing capabilities, which itself uses OptiX under the hood.
For OptiX to load, the Docker container needs to enable support for it. I am not a Docker expert, but I think that enabling the graphics driver capabilities should help: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/docker-specialized.html#driver-capabilities
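To make the suggestion above easier to verify from inside the container, here is a minimal sketch that parses the `NVIDIA_DRIVER_CAPABILITIES` environment variable and reports whether `graphics` was requested. The helper name `missing_driver_capabilities` is hypothetical, and the idea that `graphics` is the capability OptiX needs is taken from the reply above rather than confirmed independently.

```python
import os


def missing_driver_capabilities(value, required=("graphics",)):
    """Return the required capabilities absent from a NVIDIA_DRIVER_CAPABILITIES string.

    `value` is the raw comma-separated string (or None if the variable is unset).
    An unset variable is treated here as "graphics was not requested" (assumption).
    """
    enabled = {c.strip().lower() for c in (value or "").split(",") if c.strip()}
    if "all" in enabled:
        # "all" requests every driver capability.
        return []
    return [c for c in required if c not in enabled]


if __name__ == "__main__":
    value = os.environ.get("NVIDIA_DRIVER_CAPABILITIES")
    print(f"NVIDIA_DRIVER_CAPABILITIES={value!r}")
    print("missing:", missing_driver_capabilities(value) or "none")
```

Note that this only shows what the container was asked to expose; it does not prove that the container engine actually mounted the corresponding driver libraries.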
Hello @merlinND ,
Thank you, I did it. I created my container following the tutorial:
podman container create --name Sionna --device nvidia.com/gpu=all -it -p 8888:8888 --privileged=true --env NVIDIA_DRIVER_CAPABILITIES=graphics,compute,utility localhost/sionna:latest
NB: Podman accepts the same flags as Docker and works fine for the RAPIDS images.
Glad it worked!
Hello @merlinND, I did it, but it didn't work! I'm still investigating. I will try on a different machine with a different OS (Ubuntu 20.04; I currently use Red Hat Enterprise Linux 9.4 with Podman).
The "No CUDA device found;" message appears when I run: import sionna
Could you please run this inside the Docker container and share the result?
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
Hi @gmarcusm thanks,
# python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-10-07 16:48:56.624726: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-07 16:48:56.624791: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-07 16:48:56.626089: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-07 16:48:56.632935: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
also with import sionna:
`# python3
Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
import sionna
2024-10-07 16:59:29.563043: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-07 16:59:29.563161: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-07 16:59:29.564489: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-07 16:59:29.571596: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
No CUDA device found; using CPU as fallback.`
It seems that TensorFlow is not GPU-enabled, even though this is the official build from the provided Dockerfile.
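Since the first command in this post does list two GPUs for TensorFlow while `import sionna` still prints the fallback message, it may help to probe the two layers separately. The following is a minimal sketch, not Sionna's own logic: `probe_gpu_stack` is a hypothetical helper, it assumes Mitsuba 3 is installed alongside Sionna, that `cuda_ad_rgb` is the variant of interest, and that a CUDA or OptiX initialisation failure surfaces as an exception from `mi.set_variant`.

```python
import importlib


def probe_gpu_stack():
    """Report what TensorFlow and Mitsuba can each see (None = package not importable)."""
    report = {"tensorflow_gpus": None, "mitsuba_cuda": None, "mitsuba_error": None}

    # Layer 1: TensorFlow's view of the GPUs.
    try:
        tf = importlib.import_module("tensorflow")
        report["tensorflow_gpus"] = [
            d.name for d in tf.config.list_physical_devices("GPU")
        ]
    except Exception:
        pass  # TensorFlow missing or failing to import; leave the entry as None.

    # Layer 2: Mitsuba's CUDA variant, which relies on OptiX.
    try:
        mi = importlib.import_module("mitsuba")
    except Exception:
        return report

    try:
        mi.set_variant("cuda_ad_rgb")  # assumed to raise if CUDA/OptiX is unavailable
        report["mitsuba_cuda"] = True
    except Exception as exc:
        report["mitsuba_cuda"] = False
        report["mitsuba_error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in probe_gpu_stack().items():
        print(f"{key}: {value}")
```

If `tensorflow_gpus` is non-empty while `mitsuba_cuda` is `False`, the text in `mitsuba_error` would point at the ray tracing layer rather than at TensorFlow.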
Update
The Docker container seems to load the sionna package fine (with CUDA) on a computer with Ubuntu 20.04 LTS and an NVIDIA A5000 card with this driver:
| NVIDIA-SMI 470.256.02 Driver Version: 470.256.02 CUDA Version: 12.3 |
However, the strange issue appears on a GPU rack server with dual A100 GPUs running Red Hat Enterprise Linux 9.4 with Podman as the container engine.
The driver on RHEL 9.4 is:
| NVIDIA-SMI 550.90.07 Driver Version: 550.90.07 CUDA Version: 12.4 |
Other containers with more recent TensorFlow or RAPIDS versions work fine.
Still investigating...
**... after some investigation** It seems that the problem lies with the container engine, Podman. On a different Linux distribution with Docker, the environment works! Contacting Red Hat about this issue is the next step. Still investigating.
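When comparing the Podman and Docker setups, it can be useful to record which engine a container is running under and whether the OptiX driver library is visible inside it. The sketch below is an illustration under assumptions: the marker files `/run/.containerenv` (Podman) and `/.dockerenv` (Docker) are conventions rather than guarantees, `libnvoptix` is assumed to be the library the ray tracer needs, and both function names are hypothetical.

```python
import ctypes.util
import os


def detect_engine(exists=os.path.exists):
    """Guess the container engine from well-known marker files (heuristic)."""
    if exists("/run/.containerenv"):
        return "podman"
    if exists("/.dockerenv"):
        return "docker"
    return "unknown"


def optix_library():
    """Return the name of the OptiX driver library if the loader can find it, else None."""
    return ctypes.util.find_library("nvoptix")


if __name__ == "__main__":
    print("engine:", detect_engine())
    print("OptiX library:", optix_library() or "not found")
```

Running this in both the working Docker container and the failing Podman container would show whether the two environments differ in the libraries exposed to the container.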
Hi @Fedomer, did you solve this issue? Were you able to get it to work on Red Hat Linux? I am facing the same problem. TF by itself is able to find a GPU, but once I pip install sionna it can no longer find one. Not sure if Sionna downgrades the TF version and messes things up in the process.
Hello @csankar69, for the moment I'm using Docker, because I think the problem is either Podman's GPU management or a bad configuration for Sionna. I'm waiting for a new server with Red Hat and will try again. Red Hat could not solve the problem.
Closing due to inactivity