jetson-containers

[Jetpack 4.4.1 - L4T 32.4.4] Import cv2 fails "ImportError: /usr/lib/aarch64-linux-gnu/libcublas.so.10: file too short"

Open chull434 opened this issue 2 years ago • 13 comments

Hi,

I had previously built a custom l4t-ml:r32.4.4-py3 with OpenCV 4.4.0 and CUDA. However, now on a fresh install I keep getting the error below when I import cv2. Can't remember for the life of me what I did, if anything, or if it just worked previously. Any pointers to resolve this are greatly appreciated.

>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/cv2/__init__.py", line 96, in <module>
    bootstrap()
  File "/usr/local/lib/python3.6/dist-packages/cv2/__init__.py", line 86, in bootstrap
    import cv2
ImportError: /usr/lib/aarch64-linux-gnu/libcublas.so.10: file too short

chull434 avatar Aug 09 '22 23:08 chull434

From other similar issues, my ls /etc/nvidia-container-runtime/host-files-for-container.d/ returns the below:

cuda.csv cudnn.csv l4t.csv tensorrt.csv visionworks.csv

Am I possibly missing opencv.csv? Where does one get that from?

chull434 avatar Aug 09 '22 23:08 chull434
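For anyone checking the same directory on the host, here is a rough sketch that reads those CSV mount lists and reports entries that do not exist on the host. It assumes each line has the form `type, path` (for example `lib, /usr/lib/...`), which is an assumption about the file format rather than something confirmed in this thread. The helper names `parse_mount_list` and `missing_on_host` are made up for illustration.

```python
import glob
import os

# Directory mentioned in the comment above.
CSV_DIR = "/etc/nvidia-container-runtime/host-files-for-container.d"


def parse_mount_list(text):
    """Parse lines of the assumed form 'type, path' into (type, path) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "," not in line:
            continue  # skip blank or unrecognized lines
        kind, path = line.split(",", 1)
        entries.append((kind.strip(), path.strip()))
    return entries


def missing_on_host(entries):
    """Return the paths from the mount list that are absent on this machine."""
    return [path for _kind, path in entries if not os.path.lexists(path)]


if __name__ == "__main__":
    for csv_path in sorted(glob.glob(os.path.join(CSV_DIR, "*.csv"))):
        with open(csv_path) as f:
            entries = parse_mount_list(f.read())
        missing = missing_on_host(entries)
        print("{}: {} entries, {} missing on host".format(
            os.path.basename(csv_path), len(entries), len(missing)))
        for path in missing:
            print("  missing:", path)
```

Run it on the Jetson host, not inside the container. A large number of missing entries in cuda.csv would point at the host install rather than at the container image.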

Hi @chull434, you don't need an opencv.csv, because OpenCV is installed inside the container itself. However CUDA gets mounted in (when --runtime nvidia is used), and the /usr/lib/aarch64-linux-gnu/libcublas.so.10: file too short error indicates that mounting isn't working properly.

Are you starting the container with --runtime nvidia?

dusty-nv avatar Aug 09 '22 23:08 dusty-nv
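As a side note on the "file too short" message: the dynamic loader reports it when the file it opens is smaller than an ELF header, which is what you would see if the library inside the container is only an empty placeholder. A minimal sketch for checking that from inside the container follows. The function name `describe_library` is hypothetical, and the check only looks at file size and the ELF magic bytes, not at whether the library is otherwise usable.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF shared object


def describe_library(path):
    """Return a short status string for a shared library path."""
    if not os.path.lexists(path):
        return "missing"
    if not os.path.exists(path):
        return "broken symlink"
    if os.path.isdir(path):
        return "is a directory"
    size = os.path.getsize(path)
    if size == 0:
        return "empty file"
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic != ELF_MAGIC:
        return "not an ELF file"
    return "ok ({} bytes)".format(size)


if __name__ == "__main__":
    # Path taken from the ImportError in this issue.
    print(describe_library("/usr/lib/aarch64-linux-gnu/libcublas.so.10"))
```

If this prints "empty file" inside a container started with --runtime nvidia, the host library was most likely not mounted over the placeholder.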

Hi, yes I am, with this fix applied: https://github.com/dusty-nv/jetson-containers/issues/108#issuecomment-995090398

chull434 avatar Aug 10 '22 00:08 chull434

Hmm, okay. Is your fresh install of JetPack-L4T also on L4T r32.4.4?

If so, are you able to run this test command in the l4t-base container:

sudo docker run -it --rm --net=host --runtime nvidia nvcr.io/nvidia/l4t-base:r32.4.4
python3 -c 'import tensorrt'

If it fails to import tensorrt due to some CUDA library issue, then it would seem that your nvidia runtime is still messed up, and I would recommend re-installing those packages. You should be able to find those with apt-cache search nvidia-container

dusty-nv avatar Aug 10 '22 00:08 dusty-nv
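To run the same kind of import test across several modules at once, something like the sketch below could be used inside the container. The helper name `try_imports` is made up for illustration, and the module list is just the set of packages mentioned in this thread. The useful part is that it separates "module not installed" (ModuleNotFoundError) from "module installed but a shared library failed to load" (ImportError or OSError).

```python
import importlib


def try_imports(names):
    """Try to import each module and record 'ok' or the error type and message."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except Exception as exc:  # ImportError, OSError from ctypes, etc.
            results[name] = "{}: {}".format(type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    # Modules discussed in this issue; adjust to whatever the image should provide.
    for name, status in try_imports(["tensorrt", "pycuda", "torch", "cv2"]).items():
        print("{:10s} {}".format(name, status))
```

A ModuleNotFoundError only means the package is not in that image, whereas an error naming a libcu*.so file suggests the CUDA libraries are not being mounted in.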

Yep, running the same version of JetPack.

nvcr.io/nvidia/l4t-base:r32.4.4: just gives "no module found"?
nvcr.io/nvidia/l4t-ml:r32.4.4-py3: tensorrt and pycuda import with no errors.
Custom OpenCV and CUDA l4t-ml:r32.4.4-py3: tensorrt and pycuda import with no errors, but cv2 still gives back ImportError: /usr/lib/aarch64-linux-gnu/libcublas.so.10: file too short

chull434 avatar Aug 10 '22 00:08 chull434

So I've been trying to retrace my steps and rebuild the custom container to see if it's still possible on this new install. I'm getting the below error when it tries to import torch. Not sure if this highlights a possible root cause of our issue or if it's just a different issue:

Traceback (most recent call last):
  File "setup.py", line 13, in <module>
    import torch
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 188, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 141, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory

chull434 avatar Aug 10 '22 10:08 chull434

OSError: libcurand.so.10: cannot open shared object file: No such file or directory

Typically this error occurs when either (a) the --runtime nvidia mounting isn't working properly or (b) you are installing a PyTorch wheel that's incompatible with your version of JetPack (i.e. has a different version of CUDA)

Can you confirm your L4T version again with cat /etc/nv_tegra_release ? Can you find libcurand.so.10 under /usr/local/cuda/lib64 ?

dusty-nv avatar Aug 10 '22 12:08 dusty-nv

cat /etc/nv_tegra_release
# R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t186ref, EABI: aarch64, DATE: Fri Oct 16 19:37:08 UTC 2020

/usr/local/cuda/lib64$ ls -d libcurand*
libcurand.so libcurand.so.10 libcurand.so.10.1.2.89 libcurand_static.a

chull434 avatar Aug 10 '22 13:08 chull434
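For scripts that need to compare the host's L4T version against a container tag such as r32.4.4, the release line shown above can be parsed roughly like this. This is a sketch that assumes the first line of /etc/nv_tegra_release always follows the `# R<major> (release), REVISION: <minor>` pattern seen here. The function name `parse_l4t_version` is hypothetical.

```python
import re


def parse_l4t_version(first_line):
    """Extract e.g. '32.4.4' from the first line of /etc/nv_tegra_release."""
    match = re.search(r"R(\d+)\s+\(release\),\s+REVISION:\s+([\d.]+)", first_line)
    if match is None:
        return None  # line did not match the assumed pattern
    return "{}.{}".format(match.group(1), match.group(2))


if __name__ == "__main__":
    # Sample line copied from the output posted in this issue.
    sample = ("# R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t186ref, "
              "EABI: aarch64, DATE: Fri Oct 16 19:37:08 UTC 2020")
    print(parse_l4t_version(sample))  # prints 32.4.4
```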

Hmm...if you run ls -ll /usr/local/cuda/lib64/libcurand* inside the container, are those files valid?

It seems that something is wrong with your NVIDIA Container Runtime.

dusty-nv avatar Aug 10 '22 13:08 dusty-nv

hmmmm

ls -ll /usr/local/cuda/lib64/libcurand*
ls: cannot access '/usr/local/cuda/lib64/libcurand*': No such file or directory

I think I might be missing a number of files, only 2 in there?

ls /usr/local/cuda/lib64
libcudadevrt.a libcudart_static.a stubs

chull434 avatar Aug 10 '22 14:08 chull434

OK yeah, CUDA is not being mounted in. If you already started the container with --runtime nvidia, then there is a problem with the runtime, and I would recommend re-installing the nvidia-container* packages or reflashing your device.

dusty-nv avatar Aug 10 '22 14:08 dusty-nv
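The host-versus-container comparison done by hand above can be sketched as a small check: give it the library names seen on the host and it reports which ones are absent from the same directory inside the container. The helper name `missing_libraries` is made up for illustration, and the expected names are only the libcurand files from the host listing earlier in this thread, not a complete list of CUDA libraries.

```python
import os


def missing_libraries(lib_dir, expected):
    """Return the expected file names that are not present in lib_dir."""
    try:
        present = set(os.listdir(lib_dir))
    except OSError:
        # Directory missing or unreadable: treat everything as missing.
        return list(expected)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Names taken from the host's `ls -d libcurand*` output in this issue.
    expected = ["libcurand.so", "libcurand.so.10", "libcurand.so.10.1.2.89"]
    missing = missing_libraries("/usr/local/cuda/lib64", expected)
    if missing:
        print("not mounted in:", ", ".join(missing))
    else:
        print("all expected libraries are present")
```

Run inside a container started with --runtime nvidia, a non-empty "not mounted in" list matches the situation described in this comment.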

Yep, I think this install is fubar. I'm just getting the below now, so time to reflash I think, unless there's a way to fix this?

Setting up nvidia-l4t-bootloader (32.4.4-20201027211332) ...
ERROR. Unsupported board ID: .
Cannot install bootloader package. Exiting...
dpkg: error processing package nvidia-l4t-bootloader (--configure):
 installed nvidia-l4t-bootloader package post-installation script subprocess returned error exit status 1
Setting up linux-libc-dev:arm64 (4.15.0-191.202) ...
Errors were encountered while processing:
 nvidia-l4t-bootloader
E: Sub-process /usr/bin/dpkg returned an error code (1)

chull434 avatar Aug 10 '22 14:08 chull434

Ahh okay, yes I would reflash

After you reflash, try to avoid running sudo apt-get upgrade. That should leave the NVIDIA Container Runtime intact, so you don't have to apply that other fix.

dusty-nv avatar Aug 10 '22 18:08 dusty-nv
