Jetson-Nano-Ubuntu-20-image
CUDA Installation failed in bare-bones Ubuntu 20.04. See log at /var/log/cuda-installer.log for details.
Hello everyone,
I'm having trouble installing CUDA on the bare-bones Ubuntu 20.04 image. Has anyone come across similar issues? Here's the process I've been following:
1- Download the CUDA installer using the following command:
$ wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux_sbsa.run
2- Run the installer with elevated privileges:
$ sudo sh cuda_11.6.2_510.47.03_linux_sbsa.run
Unfortunately, the installation failed, and the installer told me to check the log at /var/log/cuda-installer.log for more details. Any insights or solutions would be greatly appreciated.
$ cat /var/log/cuda-installer.log
[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install:
[INFO]: Driver
[INFO]: 510.47.03
[INFO]: Executing NVIDIA-Linux-aarch64-510.47.03.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd 2>&1
[INFO]: Finished with code: 36096
[ERROR]: Install of driver component failed.
[ERROR]: Install of 510.47.03 failed, quitting
Sorry, you cannot install CUDA 11 on a Jetson Nano, due to low-level incompatibility. The 'regular' CUDA version is 10, and it is already installed, so there is no need to use the CUDA installer. This assumes we are talking about the 'old' Jetson Nano, not the Orin.
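One way to confirm that a toolkit is already present before reaching for the installer is to look for nvcc directly under the CUDA directory. This is only a sketch: the function name report_cuda is hypothetical, and /usr/local/cuda is assumed as the default location because it is the path mentioned later in this thread.

```shell
# Report whether an nvcc binary exists under a CUDA directory.
# The default /usr/local/cuda is an assumption taken from this thread.
report_cuda() {
  cuda_dir="${1:-/usr/local/cuda}"
  if [ -x "$cuda_dir/bin/nvcc" ]; then
    echo "found nvcc in $cuda_dir/bin"
    # Call nvcc by its full path, so PATH does not matter here.
    "$cuda_dir/bin/nvcc" --version
  else
    echo "no nvcc under $cuda_dir"
    return 1
  fi
}

report_cuda || true
```

If this prints a version, the toolkit is installed and only the shell's PATH needs attention.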
Thank you for your prompt response. I appreciate the clarification about CUDA compatibility on the Jetson Nano. However, when I check the CUDA version using nvcc --version, it seems that I can't find the installed CUDA version. Could you kindly provide guidance on how to resolve this issue? Thank you.
nvcc should be located in the folder /usr/local/cuda/bin/. Please add that location to your PATH.
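A minimal sketch of that PATH change, assuming the toolkit really lives in /usr/local/cuda/bin as stated above. The variable name CUDA_BIN is only illustrative, and the guard simply avoids adding the same entry twice.

```shell
# Directory assumed from the advice above; adjust if your install differs.
CUDA_BIN=/usr/local/cuda/bin

# Prepend the directory only if it is not already part of PATH.
case ":$PATH:" in
  *":$CUDA_BIN:"*) ;;
  *) export PATH="$CUDA_BIN:$PATH" ;;
esac

# Verify the result in the current shell.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc still not found; check that $CUDA_BIN exists"
fi

# To make the change permanent, the export line can be appended to ~/.bashrc:
#   echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
```

After editing ~/.bashrc, open a new terminal or run `source ~/.bashrc` so the change takes effect.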
Thank you for your generous help. I've successfully incorporated the changes into the bashrc file and verified the CUDA version is now visible.
Hello, I am reaching out for guidance based on the information provided in the following link: https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048.
To install PyTorch on my Jetson Nano, I've created a virtual environment using Python 3.6, as specified in the requirements for JetPack 4. However, during the installation process, I encountered the following error:
(py_env) jetson@nano:~$ python3
Python 3.6.15 (default, Nov 15 2023, 11:27:50)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jetson/vahid_ws/Jetson-Nano-OCR-Detection/build/config_virtualenv/py_env/lib/python3.6/site-packages/torch/__init__.py", line 195, in <module>
    _load_global_deps()
  File "/home/jetson/vahid_ws/Jetson-Nano-OCR-Detection/build/config_virtualenv/py_env/lib/python3.6/site-packages/torch/__init__.py", line 148, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/local/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory
I would greatly appreciate your advice on resolving this issue. Your assistance is invaluable to me at this stage.
Thank you in advance
Tip: ask ChatGPT. It can give valuable answers. In your case:
The error you're encountering indicates that the libmpi_cxx.so.20 shared library cannot be found. This library is part of the Message Passing Interface (MPI) library. It seems like there might be an issue with your MPI installation or the environment variables related to it.
Here are a few steps you can take to address this issue:
- Check MPI Installation: Make sure that MPI is correctly installed on your system. You may need to reinstall MPI or ensure that the required libraries are available. On a Debian-based system, you can use the following command to install MPI:
  sudo apt-get install libopenmpi-dev
  If you are using a different package manager or operating system, adjust the command accordingly.
- Set Environment Variable: If MPI is correctly installed, you may need to set the LD_LIBRARY_PATH environment variable to include the directory where libmpi_cxx.so.20 is located. You can do this by adding the following line to your shell profile file (e.g., ~/.bashrc or ~/.bash_profile):
  export LD_LIBRARY_PATH=/path/to/mpi/lib:$LD_LIBRARY_PATH
  Replace /path/to/mpi/lib with the actual path to the directory containing the MPI libraries.
- Rebuild PyTorch: If you are using a virtual environment and installed PyTorch within that environment, consider deactivating the virtual environment, then reactivate it and reinstall PyTorch. This can sometimes resolve compatibility issues:
  deactivate
  source py_env/bin/activate
  pip install torch
  Make sure to replace py_env with the actual name of your virtual environment.
- Update PyTorch: Ensure that you are using the latest version of PyTorch. You can upgrade PyTorch using the following command:
  pip install --upgrade torch
  This will install the latest version of PyTorch and its dependencies.
After performing these steps, try running your Python script again. If the issue persists, there may be other system-specific factors at play, and additional troubleshooting may be needed.
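The "Set Environment Variable" step above leaves the actual directory open. A rough sketch of how it could be filled in automatically is shown below; the helper name prepend_lib_dir and the default search root /usr/lib are assumptions made for illustration, not part of any tool mentioned in this thread.

```shell
# Find the first file matching a name pattern and prepend its directory
# to LD_LIBRARY_PATH. Both arguments are illustrative defaults.
prepend_lib_dir() {
  lib_name="$1"
  search_root="${2:-/usr/lib}"
  lib_path=$(find "$search_root" -name "$lib_name" 2>/dev/null | head -n 1)
  if [ -z "$lib_path" ]; then
    echo "not found: $lib_name under $search_root" >&2
    return 1
  fi
  lib_dir=$(dirname "$lib_path")
  # Avoid a trailing colon when LD_LIBRARY_PATH was previously empty.
  export LD_LIBRARY_PATH="$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  echo "added $lib_dir"
}

# Quote the pattern so the shell does not expand it before find sees it.
prepend_lib_dir 'libmpi_cxx.so*' /usr/lib || true
```

Note that this only helps when a file with the exact name the loader asks for exists in the directory that gets added.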
Thank you for your message. I have successfully set the environment variable. To locate libmpi, I used the following command:
$ find / -name libmpi_cxx* 2>/dev/null
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40
Additionally, I've added the following line to the bashrc file to address the PyTorch installation issue:
export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu:$LD_LIBRARY_PATH
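It may be worth noting that the search results above only list libmpi_cxx.so.40 variants, while the original error asks for libmpi_cxx.so.20. If no file with the .so.20 name exists anywhere, changing LD_LIBRARY_PATH alone would not be expected to satisfy the loader. A small sketch to compare the requested and the available versions through the linker cache follows; the helper name has_soname is hypothetical.

```shell
# Return 0 if the dynamic linker cache lists a library whose name
# contains the given string, non-zero otherwise.
has_soname() {
  ldconfig -p 2>/dev/null | grep -F -q -- "$1"
}

# Compare the version the error message asks for with the one found by find.
for name in libmpi_cxx.so.20 libmpi_cxx.so.40; do
  if has_soname "$name"; then
    echo "$name: available"
  else
    echo "$name: missing"
  fi
done
```

If the first name is reported as missing, the wheel was most likely built against a different OpenMPI version than the one this image provides.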
Following the instructions from this link, I executed the following commands for PyTorch installation:
$ wget https://nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl -O torch-1.8.0-cp36-cp36m-linux_aarch64.whl
$ sudo apt-get install python3-pip libopenblas-base libopenmpi-dev libomp-dev
$ pip3 install 'Cython<3'
$ pip3 install numpy torch-1.8.0-cp36-cp36m-linux_aarch64.whl
The installation was successful with the following packages installed:
Successfully installed Cython-0.29.36
Successfully installed dataclasses-0.8 numpy-1.19.5 torch-1.8.0 typing-extensions-4.1.1
However, I encountered an issue when attempting to install torchvision. Following the instructions here, I executed the following commands:
$ sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libopenblas-dev libavcodec-dev libavformat-dev libswscale-dev
$ git clone --branch v0.9.0 https://github.com/pytorch/vision torchvision
$ cd torchvision
$ export BUILD_VERSION=0.9.0
$ python3 setup.py install --user
Unfortunately, I encountered the same error:
(py_env) jetson@nano:~/vahid_ws/Jetson-Nano-OCR-Detection/PyTorchJetson_JetPack4/torchvision$ python3 setup.py install --user
Traceback (most recent call last):
File "setup.py", line 12, in
I appreciate your assistance in resolving this issue. Any advice you can provide would be invaluable at this stage. Thank you in advance.
Thank you for your generous help. I've successfully incorporated the changes into the bashrc file and verified the CUDA version is now visible.
Can you share which commands you used? I'm facing the same issue.