
cudnn + tensorrt container

Open jhhurwitz opened this issue 5 years ago • 17 comments

Thanks for putting this together!

Do you have any plans to add a tensorrt container (with cudnn)?

I'm looking to use an l4t-container to build an application that depends on tensorrt (in this case 7.6.3) for the jetson nano.

jhhurwitz avatar May 09 '20 01:05 jhhurwitz

hi @jhhurwitz, JetPack 4.4 has cuDNN and TensorRT mounted into the l4t-base container, along with the headers.
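
For a quick sanity check that the mount is working, you can list the mounted libraries from inside the container (a sketch; substitute the l4t-base tag matching your JetPack version):

$ sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.4.2 \
    ls /usr/lib/aarch64-linux-gnu/libcudnn* /usr/lib/aarch64-linux-gnu/libnvinfer*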

dusty-nv avatar May 09 '20 01:05 dusty-nv

Hi, yes, I am aware of this. I am hoping to build applications for the Jetson Nano without a dependency on JetPack, from an x86 machine (with QEMU) or a more powerful ARM machine. The l4t container looked like a good place to start.

Apologies if I'm misunderstanding something! Thanks.

jhhurwitz avatar May 09 '20 01:05 jhhurwitz

Hi @dusty-nv, I pulled the image with docker pull nvcr.io/nvidia/l4t-pytorch:r32.4.2-pth1.5-py3 and then ran the container with sudo docker run -it --rm --network host nvcr.io/nvidia/l4t-pytorch:r32.4.2-pth1.5-py3, but there is no libcudnn and there are no cuDNN header files in /usr/include. Are these missing from this image?

MoonBlvd avatar May 31 '20 19:05 MoonBlvd

but there is no libcudnn or cudnn header files in /usr/include

You need to use --runtime nvidia during your docker run command, or set the default runtime to nvidia as shown here: https://github.com/dusty-nv/jetson-containers#docker-default-runtime

The CUDA/cuDNN/etc. header files and libraries are automatically mapped into the container by the NVIDIA Docker runtime.
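
For reference, setting the default runtime comes down to adding "default-runtime": "nvidia" to /etc/docker/daemon.json (sketched below) and restarting the Docker service:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

$ sudo systemctl restart docker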

dusty-nv avatar Jun 01 '20 14:06 dusty-nv

Also I believe the headers would then be found under /usr/include/aarch64-linux-gnu and /usr/local/cuda/include
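
So from inside a container started with --runtime nvidia, a quick check would be something like this (exact filenames vary by JetPack version):

$ ls /usr/include/aarch64-linux-gnu/cudnn* /usr/local/cuda/include/cuda*.h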

dusty-nv avatar Jun 01 '20 14:06 dusty-nv

Following up on this - run the following:

$ sudo apt-get install dos2unix
$ sudo dos2unix  /etc/nvidia-container-runtime/host-files-for-container.d/cudnn.csv 

Then reboot or restart your docker service, and you should then see the cudnn libraries/headers in the containers.
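
If you want to confirm the line-ending problem first, the file utility should report "with CRLF line terminators" on an affected CSV before the conversion:

$ file /etc/nvidia-container-runtime/host-files-for-container.d/cudnn.csv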

dusty-nv avatar Jun 04 '20 01:06 dusty-nv

Following up on this - run the following:

$ sudo apt-get install dos2unix
$ sudo dos2unix  /etc/nvidia-container-runtime/host-files-for-container.d/cudnn.csv 

Then reboot or restart your docker service, and you should then see the cudnn libraries/headers in the containers.

Hello @dusty-nv! I am also trying to build TensorFlow 2.2.0 from source for the Jetson Nano in a Docker container. I set my default container runtime to nvidia and did the steps above, but files such as cudnn.h still do not exist in the /usr/include directory. Am I missing something?

myaldiz avatar Jun 26 '20 17:06 myaldiz

Hi @myaldiz, try adding this command to your dockerfile:

RUN ln -s /usr/include/aarch64-linux-gnu/cudnn_v8.h /usr/include/cudnn.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_version_v8.h /usr/include/cudnn_version.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_backend_v8.h /usr/include/cudnn_backend.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h /usr/include/cudnn_adv_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_train_v8.h /usr/include/cudnn_adv_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_infer_v8.h /usr/include/cudnn_cnn_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_train_v8.h /usr/include/cudnn_cnn_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_infer_v8.h /usr/include/cudnn_ops_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_train_v8.h /usr/include/cudnn_ops_train.h && \
    ls -ll /usr/include/cudnn*

dusty-nv avatar Jun 26 '20 17:06 dusty-nv

Hi @myaldiz, try adding this command to your dockerfile:

RUN ln -s /usr/include/aarch64-linux-gnu/cudnn_v8.h /usr/include/cudnn.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_version_v8.h /usr/include/cudnn_version.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_backend_v8.h /usr/include/cudnn_backend.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h /usr/include/cudnn_adv_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_train_v8.h /usr/include/cudnn_adv_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_infer_v8.h /usr/include/cudnn_cnn_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_train_v8.h /usr/include/cudnn_cnn_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_infer_v8.h /usr/include/cudnn_ops_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_train_v8.h /usr/include/cudnn_ops_train.h && \
    ls -ll /usr/include/cudnn*

Thank you so much! That solved the issue, but I am having another compilation problem now:

ERROR: /root/tensorflow/tensorflow/core/kernels/BUILD:6109:1: C++ compilation of rule '//tensorflow/core/kernels:training_ops' failed (Exit 1)
aarch64-linux-gnu-gcc-8: fatal error: Killed signal terminated program cc1plus
compilation terminated.
Target //tensorflow/tools/pip_package:build_pip_package failed to build

I tried gcc-8 for compilation, but it did not help. Here is the Dockerfile I am trying to build (please ignore the unnecessary apt-gets):

FROM nvcr.io/nvidia/l4t-base:r32.4.2

RUN export DEBIAN_FRONTEND=noninteractive \
  && apt-get update \
  && apt-get upgrade -y \
  && apt-get install -y \
    vim git wget cmake build-essential libssl-dev \
    g++ gphoto2 libgphoto2-dev doxygen \ 
    libbz2-dev unzip libpng-dev \
    libtbb2 libtbb-dev libtiff5-dev libv4l-dev \
    libicu-dev autotools-dev \
    libhdf5-serial-dev hdf5-tools \
    libhdf5-dev zlib1g-dev zip libjpeg8-dev \
    liblapack-dev libblas-dev gfortran \
    libavcodec-dev libavformat-dev libavutil-dev \
    libeigen3-dev libglew-dev libgtk2.0-dev \
    libgtk-3-dev libjpeg-dev libpostproc-dev \
    libswscale-dev libxvidcore-dev libx264-dev \
    qt5-default pkg-config openjdk-11-jdk\
  && apt-get install -y \
    python3-dev python3-pip python3-numpy python3-py \
    python3-setuptools python3-pytest python3-opencv\
  && apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev \
  && apt-get autoclean && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

RUN python3 -m pip install --upgrade pip \
  && python3 -m pip install -U pip testresources setuptools \
  && python3 -m pip install -U \
    'numpy<1.19.0' future mock \
    h5py keras_preprocessing \
    keras_applications gast futures \
    protobuf pybind11 jupyter tqdm matplotlib \
  && python3 -m pip install -U six wheel setuptools mock

RUN cd ~ \
  && wget https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-dist.zip \
  && unzip bazel-3.1.0-dist.zip -d bazel \
  && cd bazel \
  && EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" ./compile.sh \
  && cp output/bazel /usr/local/bin
  
RUN ln -s /usr/include/aarch64-linux-gnu/cudnn_v8.h /usr/include/cudnn.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_version_v8.h /usr/include/cudnn_version.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_backend_v8.h /usr/include/cudnn_backend.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h /usr/include/cudnn_adv_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_train_v8.h /usr/include/cudnn_adv_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_infer_v8.h /usr/include/cudnn_cnn_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_train_v8.h /usr/include/cudnn_cnn_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_infer_v8.h /usr/include/cudnn_ops_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_train_v8.h /usr/include/cudnn_ops_train.h && \
    ls -ll /usr/include/cudnn* \
    && sh -c "echo '/usr/local/cuda/lib64' >> /etc/ld.so.conf.d/nvidia.conf" \
    && ldconfig

RUN cd ~ \
  && git clone --recursive --branch "v2.3.0-rc0" https://github.com/tensorflow/tensorflow.git

RUN cd ~/tensorflow \
  && PYTHON_BIN_PATH=$(which python3) \
  PYTHON_LIB_PATH=$(python3 -c 'import site; print(site.getsitepackages()[0])') \
  TF_NEED_OPENCL_SYCL=0 \
  TF_NEED_OPENCL=0 \
  TF_NEED_ROCM=0 \
  TF_NEED_CUDA=1 \
  TF_NEED_TENSORRT=1 \
  TF_CUDA_VERSION=10.2 \
  TF_TENSORRT_VERSION=7 \
  CUDA_TOOLKIT_PATH=/usr/local/cuda \
  CUDNN_INSTALL_PATH=/usr/lib/aarch64-linux-gnu \
  TENSORRT_INSTALL_PATH=/usr/lib/aarch64-linux-gnu \
  TF_CUDA_COMPUTE_CAPABILITIES=5.3 \
  TF_CUDA_CLANG=0 \
  TF_NEED_IGNITE=0 \
  TF_ENABLE_XLA=0 \
  TF_NEED_MPI=0 \
  GCC_HOST_COMPILER_PATH=$(which gcc) \
  CC_OPT_FLAGS="-march=native" \
  TF_SET_ANDROID_WORKSPACE=0 \
  ./configure \
  && bazel build --config=opt --config=cuda --config=noaws \
   --config=nogcp --local_ram_resources=2048 //tensorflow/tools/pip_package:build_pip_package \
  && bazel-bin/tensorflow/tools/pip_package/build_pip_package wheel/tensorflow_pkg

RUN python3 -m pip install ~/tensorflow/wheel/tensorflow_pkg/tensorflow*.whl

I would really appreciate your insight on possible solutions for building TF 2.3. Normally I would just go with the pre-built version, but apparently there is a bug in one of the experimental features of that release.

myaldiz avatar Jun 28 '20 09:06 myaldiz

When it says 'killed', that most often means out of memory.

Can you try mounting swap and keeping an eye on the memory usage?
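
A minimal sketch of adding swap on the Nano (size and path are just examples) and watching usage; it can also help to lower Bazel's parallelism, e.g. by adding --jobs=2 to the bazel build line:

$ sudo fallocate -l 8G /mnt/8GB.swap
$ sudo chmod 600 /mnt/8GB.swap
$ sudo mkswap /mnt/8GB.swap
$ sudo swapon /mnt/8GB.swap
$ free -h   # watch memory usage while the build runs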



dusty-nv avatar Jun 28 '20 13:06 dusty-nv

When it says 'killed', that most often means out of memory. Can you try mounting swap and keeping an eye on the memory usage?

I added 8GB of swap and removed the --local_ram_resources option, and it printed the successful build message, until I saw an out-of-disk-space message :), so I have to build again with a larger SD card. It took around 32 hours for me to see that message, and from time to time the build used almost 10GB of memory. Thanks for your advice!

myaldiz avatar Jun 30 '20 12:06 myaldiz

Hi

Hi @myaldiz, try adding this command to your dockerfile:

RUN ln -s /usr/include/aarch64-linux-gnu/cudnn_v8.h /usr/include/cudnn.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_version_v8.h /usr/include/cudnn_version.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_backend_v8.h /usr/include/cudnn_backend.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h /usr/include/cudnn_adv_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_train_v8.h /usr/include/cudnn_adv_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_infer_v8.h /usr/include/cudnn_cnn_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_train_v8.h /usr/include/cudnn_cnn_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_infer_v8.h /usr/include/cudnn_ops_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_train_v8.h /usr/include/cudnn_ops_train.h && \
    ls -ll /usr/include/cudnn*

Hi. I added this to my Dockerfile, but /usr/include/cudnn.h cannot be found.

yeyanan93 avatar Aug 01 '22 09:08 yeyanan93

FROM nvcr.io/nvidia/l4t-base:r32.7.1

ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /opt

RUN apt-get update && apt install -y \
    build-essential \
    cmake \
    git \
    pkg-config \
    libgtk-3-dev \
    libavcodec-dev \
    libswscale-dev \
    libv4l-dev \
    libxvidcore-dev \
    libx264-dev \
    libjpeg-dev \
    libpng-dev \
    libtiff-dev \
    gfortran \
    openexr \
    libatlas-base-dev \
    python3-dev \
    python3-numpy \
    libtbb2 \
    libtbb-dev \
    libdc1394-22-dev \
    libopenexr-dev \
    libgstreamer-plugins-base1.0-dev \
    libgstreamer1.0-dev

RUN ln -s /usr/include/aarch64-linux-gnu/cudnn_v8.h /usr/include/cudnn.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_version_v8.h /usr/include/cudnn_version.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_backend_v8.h /usr/include/cudnn_backend.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h /usr/include/cudnn_adv_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_adv_train_v8.h /usr/include/cudnn_adv_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_infer_v8.h /usr/include/cudnn_cnn_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_cnn_train_v8.h /usr/include/cudnn_cnn_train.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_infer_v8.h /usr/include/cudnn_ops_infer.h && \
    ln -s /usr/include/aarch64-linux-gnu/cudnn_ops_train_v8.h /usr/include/cudnn_ops_train.h && \
    ls -ll /usr/include/cudnn* && \
    pwd

ADD opencv_build opencv_build

RUN ls -ll /usr/include/cudnn*
RUN ls /usr/include/cudnn*
RUN cat /usr/include/cudnn.h
RUN cd opencv_build && \
    cd opencv && \
    mkdir build && \
    cd build && \
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
        -D CMAKE_INSTALL_PREFIX=/usr/local \
        -D INSTALL_C_EXAMPLES=ON \
        -D INSTALL_PYTHON_EXAMPLES=ON \
        -D OPENCV_GENERATE_PKGCONFIG=ON \
        -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
        -D BUILD_EXAMPLES=ON \
        -D CUDA_cublas_LIBRARY=/usr/lib/aarch64-linux-gnu/libcublas.so \
        -D CUDA_cufft_LIBRARY=/usr/local/cuda/lib64/libcufft.so \
        -D CUDA_nppc_LIBRARY=/usr/local/cuda/lib64/libnppc.so \
        -D CUDA_nppial_LIBRARY=/usr/local/cuda/lib64/libnppial.so \
        -D CUDA_nppicc_LIBRARY=/usr/local/cuda/lib64/libnppicc.so \
        -D CUDA_nppicom_LIBRARY=/usr/local/cuda/lib64/libnppicom.so \
        -D CUDA_nppidei_LIBRARY=/usr/local/cuda/lib64/libnppidei.so \
        -D CUDA_nppif_LIBRARY=/usr/local/cuda/lib64/libnppif.so \
        -D CUDA_nppig_LIBRARY=/usr/local/cuda/lib64/libnppig.so \
        -D CUDA_nppim_LIBRARY=/usr/local/cuda/lib64/libnppim.so \
        -D CUDA_nppist_LIBRARY=/usr/local/cuda/lib64/libnppist.so \
        -D CUDA_nppisu_LIBRARY=/usr/local/cuda/lib64/libnppisu.so \
        -D CUDA_nppitc_LIBRARY=/usr/local/cuda/lib64/libnppitc.so \
        -D CUDA_npps_LIBRARY=/usr/local/cuda/lib64/libnpps.so \
        -D WITH_CUDA=ON \
        -D WITH_CUDNN=ON \
        -D CUDNN_VERSION='8.0' \
        -D CUDNN_LIBRARY=/usr/lib/aarch64-linux-gnu/libcudnn.so \
        -D CUDNN_INCLUDE_DIR="/usr/include/" ..
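
Once the make/install steps are added after the cmake configure, one way to confirm cuDNN was actually picked up is to inspect the OpenCV build information (a sketch, assuming the Python bindings get installed):

$ python3 -c "import cv2; print(cv2.getBuildInformation())" | grep -i cudnn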

yeyanan93 avatar Aug 01 '22 09:08 yeyanan93

Hi @yeyanan93, what does RUN ls -ll /usr/include/aarch64-linux-gnu/cudnn* show within your dockerfile?

dusty-nv avatar Aug 01 '22 13:08 dusty-nv

Hi @yeyanan93, what does RUN ls -ll /usr/include/aarch64-linux-gnu/cudnn* show within your dockerfile?

lrwxrwxrwx 1 root root 41 Aug 2 09:47 /usr/include/cudnn.h -> /usr/include/aarch64-linux-gnu/cudnn_v8.h
lrwxrwxrwx 1 root root 51 Aug 2 09:47 /usr/include/cudnn_adv_infer.h -> /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h
lrwxrwxrwx 1 root root 51 Aug 2 09:47 /usr/include/cudnn_adv_train.h -> /usr/include/aarch64-linux-gnu/cudnn_adv_train_v8.h
lrwxrwxrwx 1 root root 49 Aug 2 09:47 /usr/include/cudnn_backend.h -> /usr/include/aarch64-linux-gnu/cudnn_backend_v8.h
lrwxrwxrwx 1 root root 51 Aug 2 09:47 /usr/include/cudnn_cnn_infer.h -> /usr/include/aarch64-linux-gnu/cudnn_cnn_infer_v8.h
lrwxrwxrwx 1 root root 51 Aug 2 09:47 /usr/include/cudnn_cnn_train.h -> /usr/include/aarch64-linux-gnu/cudnn_cnn_train_v8.h
lrwxrwxrwx 1 root root 51 Aug 2 09:47 /usr/include/cudnn_ops_infer.h -> /usr/include/aarch64-linux-gnu/cudnn_ops_infer_v8.h
lrwxrwxrwx 1 root root 51 Aug 2 09:47 /usr/include/cudnn_ops_train.h -> /usr/include/aarch64-linux-gnu/cudnn_ops_train_v8.h
lrwxrwxrwx 1 root root 49 Aug 2 09:47 /usr/include/cudnn_version.h -> /usr/include/aarch64-linux-gnu/cudnn_version_v8.h

yeyanan93 avatar Aug 02 '22 09:08 yeyanan93

I think the /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h file does not actually exist.

Although ls can list it, I cannot open it with $ cat /usr/include/aarch64-linux-gnu/cudnn_adv_infer_v8.h.

I used the following to solve this issue:

ADD cudnn cudnn
RUN cd cudnn && \
    dpkg -i libcudnn8_8.0.0.180-1+cuda10.2_arm64.deb && \
    dpkg -i libcudnn8-dev_8.0.0.180-1+cuda10.2_arm64.deb && \
    dpkg -i libcudnn8-doc_8.0.0.180-1+cuda10.2_arm64.deb
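
A quick way to double-check what actually ended up in the image afterwards (a sketch):

RUN dpkg -l | grep libcudnn && find /usr/include -name 'cudnn*.h'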

yeyanan93 avatar Aug 02 '22 09:08 yeyanan93

Have you set the default docker runtime to nvidia?

https://github.com/dusty-nv/jetson-containers#docker-default-runtime

This will mount those files into the container during docker build operations.

dusty-nv avatar Aug 02 '22 12:08 dusty-nv