jetson-containers
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
reproduction path
- Run docker container
docker run -it --rm --net=host --runtime nvidia nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.8-py3
- Run a python3 session and import torch
>>> import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 195, in <module>
_load_global_deps()
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 148, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
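The failure can be isolated from PyTorch with a minimal sketch that asks the dynamic loader for the library directly, using the same `ctypes.CDLL` call that appears in the traceback. The helper name `can_load` is hypothetical, and `libcurand.so.10` is the only library name taken from the report; any other names passed on the command line are the reader's own assumption.

```python
import ctypes
import sys


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can find and open the library."""
    try:
        # Same call that fails inside torch's _load_global_deps().
        ctypes.CDLL(library_name, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Default to the library named in the error; extra names may be given as arguments.
    names = sys.argv[1:] or ["libcurand.so.10"]
    for name in names:
        status = "ok" if can_load(name) else "MISSING"
        print(f"{name}: {status}")
```

If this reports the library as missing inside the container, the problem lies in the container's CUDA libraries rather than in the PyTorch installation itself.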
nvidia-jetpack specification
Package: nvidia-jetpack
Version: 5.0.1-b118
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 5.0.1-b118), nvidia-opencv (= 5.0.1-b118), nvidia-cudnn8 (= 5.0.1-b118), nvidia-tensorrt (= 5.0.1-b118), nvidia-container (= 5.0.1-b118), nvidia-vpi (= 5.0.1-b118), nvidia-nsight-sys (= 5.0.1-b118), nvidia-l4t-jetson-multimedia-api (>> 34.1-0), nvidia-l4t-jetson-multimedia-api (<< 34.2-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_5.0.1-b118_arm64.deb
Size: 29376
SHA256: d7ff0e4a95cc11c7a5d0b9e347923e8233ab544431d5db49d18c24944902e7a2
SHA1: fcab6ba9d6dca4a8b3e758d6fb1584baed34f7ed
MD5sum: f168d009bf5e3ee36ab14e646ad4b7dc
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8
Package: nvidia-jetpack
Version: 5.0-b114
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 5.0-b114), nvidia-opencv (= 5.0-b114), nvidia-cudnn8 (= 5.0-b114), nvidia-tensorrt (= 5.0-b114), nvidia-container (= 5.0-b114), nvidia-vpi (= 5.0-b114), nvidia-nsight-sys (= 5.0-b114), nvidia-l4t-jetson-multimedia-api (>> 34.1-0), nvidia-l4t-jetson-multimedia-api (<< 34.2-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_5.0-b114_arm64.deb
Size: 29370
SHA256: 3b5c14e3ed53cd2517d1a318d056aad3d8b44ff660a489a9b62825d518cf7c5b
SHA1: 608d1f78791a2bdda8bf88443796dfe99f19b199
MD5sum: dbcb9ff116c50b66d5270acd95e05f9a
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8
additional information
- /usr/local/cuda/lib64/ does not contain the shared library files (only static libraries and stubs).
root@ubuntu:/# ls /usr/local/cuda/lib64/
libcudadevrt.a libcudart_static.a stubs
- Default runtime is set to nvidia
ubuntu@ubuntu:~$ docker info
Client:
Context: default
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 53
Server Version: 20.10.12
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc
Default Runtime: nvidia
Init Binary: docker-init
containerd version:
runc version: 7cfd3bd
init version:
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.65-tegra
Operating System: Ubuntu 20.04.4 LTS
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 14.56GiB
Name: ubuntu
ID: TSUV:CCRX:H2ZP:OR7L:E4SU:KG5S:RTJS:63BA:6UJB:DPKB:7EMK:CBV6
Docker Root Dir: /mnt/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
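The first observation above (no shared libraries under /usr/local/cuda/lib64/) can also be checked programmatically. The following is a rough sketch, not an official diagnostic: the function name `find_shared_libraries` is hypothetical, and the second search directory is an assumption about where an aarch64 Ubuntu image might keep libraries.

```python
from pathlib import Path
from typing import Iterable, List


def find_shared_libraries(directories: Iterable[str], stem: str) -> List[Path]:
    """Return files named '<stem>.so' or '<stem>.so.<version>' found in the directories."""
    matches: List[Path] = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip directories that do not exist in this container
        matches.extend(sorted(root.glob(f"{stem}.so*")))
    return matches


if __name__ == "__main__":
    search_dirs = [
        "/usr/local/cuda/lib64",      # directory listed in the report
        "/usr/lib/aarch64-linux-gnu",  # assumption: common library path on arm64 Ubuntu
    ]
    found = find_shared_libraries(search_dirs, "libcurand")
    print(found if found else "no libcurand shared library found")
```

An empty result here matches the `ls` output in the report, where only `libcudadevrt.a`, `libcudart_static.a`, and `stubs` are present.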
Hi @SkalskiP, it seems you are using the new JetPack, so you should use the Docker images for the new L4T too. Try this image: nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3
Hi @CourchesneA, the new one works. Doesn't it kind of defeat Docker's purpose, though? I would like to be able to run my container on different hosts, regardless of the OS they are running. Is that impossible with the new JetPack?
Well, from what I understand, we are not exactly there yet with nvidia-docker. Specifically, CUDA used to be mounted from the host into the container, but on Jetson, compatibility between different CUDA versions on the host and in the container was a problem. For the new L4T containers, CUDA is no longer mounted from the host; it is contained in the images (hence the images are bigger). While this will solve some compatibility issues and restrictions between host and container, I think it explains why JetPack 4.5 hosts are not compatible with JetPack 5 containers and vice versa.
Hi @CourchesneA, @SkalskiP, yes, JetPack 5.x has migrated to having CUDA/cuDNN/TensorRT/etc. installed into the container, so the containers are more portable. For example, you can run container images built for both JetPack 5.0 and 5.0.1 on JetPack 5.0.1 without needing to rebuild them.
As @CourchesneA said, JetPack 4.x container images are not compatible with JetPack 5.x and would need to be rebuilt.
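To make the mismatch in this report concrete, here is a small sketch that reads the L4T release out of an image tag and compares it with the host. The helper names are hypothetical, and comparing only the major L4T release is a simplifying assumption for illustration, not an official compatibility rule. The host release (34, 1) is inferred from the nvidia-l4t-jetson-multimedia-api bounds (>> 34.1-0, << 34.2-0) in the package listing above.

```python
import re
from typing import Tuple


def l4t_release_from_tag(image: str) -> Tuple[int, ...]:
    """Extract the L4T release, e.g. (32, 6, 1), from a tag like 'r32.6.1-pth1.8-py3'."""
    tag = image.rsplit(":", 1)[-1]
    match = re.match(r"r(\d+)\.(\d+)(?:\.(\d+))?", tag)
    if match is None:
        raise ValueError(f"no L4T release found in tag {tag!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def same_l4t_major(image: str, host_release: Tuple[int, ...]) -> bool:
    """Heuristic (assumption): treat images as candidates only if the major release matches."""
    return l4t_release_from_tag(image)[0] == host_release[0]


if __name__ == "__main__":
    host = (34, 1)  # inferred from the JetPack 5.0.1 package metadata in this thread
    for image in (
        "nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.8-py3",
        "nvcr.io/nvidia/l4t-pytorch:r34.1.0-pth1.12-py3",
    ):
        print(image, "->", "candidate" if same_l4t_major(image, host) else "mismatch")
```

A matching major release is only a first filter; the thread itself confirms just that 5.0 and 5.0.1 images run on a 5.0.1 host and that 4.x images do not run on 5.x.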
Hi, @dusty-nv! 👋 Hm... The main problem is that I actually build my own custom Docker image, and it threw the same error. I don't think rebuilding the image on the new host solves the issue.
In that case, are you sure the PyTorch wheel that is being used in the container is also compatible with your version of JetPack?
The wheels for JetPack 5.x are here:
- https://elinux.org/Jetson_Zoo#PyTorch_.28Caffe2.29
- https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-11-now-available/72048
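One quick sanity check on a downloaded wheel is whether its filename tags match the Python interpreter in the container. The sketch below relies only on the standard wheel filename convention (distribution, version, then python, ABI, and platform tags). The example filename in it is illustrative and not taken from this thread. Matching tags are necessary but not sufficient: they say nothing about which CUDA or JetPack version the wheel was built against.

```python
import sys
from typing import Optional, Tuple


def wheel_tags(filename: str) -> Tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def matches_interpreter(filename: str, version: Optional[Tuple[int, int]] = None) -> bool:
    """Check the wheel's CPython tag against a (major, minor) version, default: this interpreter."""
    major, minor = version if version is not None else sys.version_info[:2]
    return wheel_tags(filename)[0] == f"cp{major}{minor}"


if __name__ == "__main__":
    # Illustrative filename (assumption); substitute the wheel actually installed.
    example = "torch-1.8.0-cp36-cp36m-linux_aarch64.whl"
    print(wheel_tags(example), matches_interpreter(example))
```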