Tx2 install

Open Pavlos-Tosidis opened this issue 3 years ago • 24 comments

TX-2 installation script. Updated installation.md and fixed some import statements in object_detection_2d/detr and object_detection_2d/gem

Pavlos-Tosidis avatar Jan 10 '22 09:01 Pavlos-Tosidis

I don't have a TX2 to test this, can somebody else validate that the script works? @thomaspeyrucain maybe?

ad-daniel avatar Jan 11 '22 07:01 ad-daniel

@ad-daniel OK, I will test it in a Docker container on our Jetson.

thomaspeyrucain avatar Jan 13 '22 12:01 thomaspeyrucain

Just to be sure, I wanted to check whether this Dockerfile was affected by the NVIDIA key rotation. However, it seems that it isn't possible to build the image without an ARM system (or at least, I wasn't able to, despite passing --platform linux/arm64 when building), so I haven't been able to test. Has anybody managed to build the image successfully in the past 2-4 days? Or did you encounter a key issue?

ad-daniel avatar May 11 '22 06:05 ad-daniel

Since the last build on our side happened last week, I am running the docker build again now. Will update when it is completed.

Pavlos-Tosidis avatar May 11 '22 10:05 Pavlos-Tosidis

Just built the docker image and it is working.

Pavlos-Tosidis avatar May 11 '22 17:05 Pavlos-Tosidis

@Pavlos-Tosidis Just one last question: did you build the images on your machine or on the TX2? If, like me, you are unable to build it on an amd64 platform, then we won't be able to have the CI create the image and publish it automatically (as we do for the other Docker images).

ad-daniel avatar May 16 '22 12:05 ad-daniel
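
The question of whether a given machine (or CI runner) can build the image natively comes down to its CPU architecture. A minimal sketch of such a check, assuming a Python step in the build tooling; the helper name `is_arm64` is hypothetical and not part of OpenDR:

```python
import platform

# Names under which common operating systems report a 64-bit ARM CPU.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine=None):
    """Return True if the given (or current) machine string denotes 64-bit ARM."""
    name = machine if machine is not None else platform.machine()
    return name.lower() in ARM64_NAMES


if __name__ == "__main__":
    if is_arm64():
        print("arm64 host: a native build of the Jetson image may be possible")
    else:
        print(f"{platform.machine()} host: not a native arm64 build environment")
```

On a host where this reports a non-ARM architecture, a native build is not available, which matches the situation described above.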

It showed some errors during the process, but it finished building the image. Log: OpenDR_docker_build_NX.txt

thomaspeyrucain avatar May 16 '22 15:05 thomaspeyrucain

I tried to install the packages needed to run the ROS nodes: sudo apt-get install ros-noetic-vision-msgs ros-noetic-geometry-msgs ros-noetic-sensor-msgs ros-noetic-audio-common-msgs. However, apt was not able to find the packages, so I cloned the repositories and then built the workspace.

thomaspeyrucain avatar May 16 '22 15:05 thomaspeyrucain

It looks like some packages were not installed correctly, or a library is missing:

root@98fb18d18058:/opendr# python3
Python 3.6.9 (default, Mar 15 2022, 13:55:28) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opendr/mxnet/python/mxnet/__init__.py", line 23, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "/opendr/mxnet/python/mxnet/context.py", line 23, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "/opendr/mxnet/python/mxnet/base.py", line 351, in <module>
    _LIB = _load_lib()
  File "/opendr/mxnet/python/mxnet/base.py", line 341, in _load_lib
    lib_path = libinfo.find_lib_path()
  File "/opendr/mxnet/python/mxnet/libinfo.py", line 73, in find_lib_path
    'List of candidates:\n' + str('\n'.join(dll_path)))
RuntimeError: Cannot find the MXNet library.
List of candidates:
/opendr/lib/libmxnet.so
/opt/ros/noetic/lib/libmxnet.so
/usr/local/cuda-10.2/targets/aarch64-linux/lib/libmxnet.so
libmxnet.so
/opendr/mxnet/python/mxnet/libmxnet.so
/opendr/mxnet/python/mxnet/../../lib/libmxnet.so
/opendr/mxnet/python/mxnet/../../build/libmxnet.so
../../../libmxnet.so
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory

thomaspeyrucain avatar May 16 '22 15:05 thomaspeyrucain
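
Both failures in the traceback come down to a shared library that the dynamic loader cannot find. A quick way to check this without importing the full frameworks is to try loading the libraries directly with `ctypes`. This is only a diagnostic sketch: the helper name `check_shared_libs` is hypothetical, and the library names are the ones reported in the traceback.

```python
import ctypes


def check_shared_libs(names):
    """Try to load each shared library by name.

    Returns a dict mapping each name to None on success,
    or to the loader's error message on failure.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Library names taken from the traceback in this thread.
    report = check_shared_libs(["libcurand.so.10", "libmxnet.so"])
    for lib, err in report.items():
        status = "OK" if err is None else "MISSING (" + err + ")"
        print(lib + ": " + status)
```

Run inside the container, this shows whether the loader's search path contains the CUDA and MXNet libraries at all, independently of how the Python packages were installed.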

Dear Thomas, did you follow the instructions for building the Docker image correctly? This happened to me when I didn't edit the '/etc/docker/daemon.json' file correctly. Additionally, the script builds the ROS packages as well, so there is no need to install further packages.

Pavlos-Tosidis avatar May 16 '22 16:05 Pavlos-Tosidis
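
The thread does not say which setting in '/etc/docker/daemon.json' matters. On Jetson boards it is commonly the `"default-runtime": "nvidia"` entry, which makes the NVIDIA runtime (and with it the CUDA libraries) available during `docker build`. Assuming that this is the setting meant here, a minimal sketch to check it could look like the following; the function name `default_runtime_is_nvidia` is hypothetical.

```python
import json


def default_runtime_is_nvidia(daemon_json_text):
    """Return True if the Docker daemon config sets "default-runtime" to "nvidia"."""
    try:
        config = json.loads(daemon_json_text)
    except json.JSONDecodeError:
        # A malformed file is treated as "not configured".
        return False
    return isinstance(config, dict) and config.get("default-runtime") == "nvidia"


if __name__ == "__main__":
    try:
        with open("/etc/docker/daemon.json") as f:
            print("nvidia default runtime:", default_runtime_is_nvidia(f.read()))
    except OSError as exc:
        print("could not read /etc/docker/daemon.json:", exc)
```

The Docker daemon reads this file at startup, so a change only takes effect after the daemon is restarted.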

@Pavlos-Tosidis I discussed this with my colleague and we checked the log. Some things are missing inside the Docker image, such as cudnn.h and libcurand.so.10. Could you please check that everything is correctly installed in your image?

thomaspeyrucain avatar May 20 '22 10:05 thomaspeyrucain

@thomaspeyrucain As we discussed, I have flashed a TX2 and installed the Docker image on it multiple times, with the same results each time. I didn't try the ROS scripts, but the Python scripts run successfully.

Pavlos-Tosidis avatar May 20 '22 11:05 Pavlos-Tosidis

Did you try on any NX boards?

thomaspeyrucain avatar May 20 '22 11:05 thomaspeyrucain

Do you build the Docker image from scratch? The base image nvcr.io/nvidia/l4t-base:r32.6.1 does not contain cudnn.h or the corresponding libraries.

thomaspeyrucain avatar May 20 '22 11:05 thomaspeyrucain

Do you run the container with the proper arguments? -it --privileged

Pavlos-Tosidis avatar May 20 '22 11:05 Pavlos-Tosidis

The errors occur when building the image from the Dockerfile. The arguments -it --privileged only apply when you run the Docker image.

thomaspeyrucain avatar May 20 '22 11:05 thomaspeyrucain

The errors you sent me so far are while trying to import torch/mxnet. There are a lot of warnings when building mxnet, but it gets installed and working as intended.

Pavlos-Tosidis avatar May 20 '22 11:05 Pavlos-Tosidis

If you check the log file, you will see that the errors appear while building the Docker image.

thomaspeyrucain avatar May 20 '22 11:05 thomaspeyrucain

Since it does produce the Docker image in the end, can you please try running, for example (after activating opendr): python3 projects/perception/face_recognition/demos/infrence_demo.py

Pavlos-Tosidis avatar May 20 '22 11:05 Pavlos-Tosidis

@thomaspeyrucain I just finished the Docker image installation on an NVIDIA NX. Everything ran smoothly, and I tried the face_recognition and RetinaFace inference demos for testing. Both ran as intended.

Pavlos-Tosidis avatar May 24 '22 19:05 Pavlos-Tosidis

@Pavlos-Tosidis Nice, did you manage to run the ROS nodes without issues inside the Docker container?

thomaspeyrucain avatar May 25 '22 07:05 thomaspeyrucain

The NX Docker image is up on Docker Hub. You can download and run it with:

xhost +local:root
sudo docker run -it --privileged -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=unix$DISPLAY opendr/opendr-toolkit:nx /bin/bash

I fixed a couple of ROS issues. If the NX is correctly flashed (with CUDA and the NVIDIA runtime), it works out of the box.

Pavlos-Tosidis avatar May 25 '22 15:05 Pavlos-Tosidis
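
For anyone launching the container from a script rather than by hand, the `docker run` invocation in this comment can be assembled as an argument list. This is only a sketch: the function name `nx_docker_run_argv` and the `:0` fallback for an unset DISPLAY are assumptions, while the flags, volume, and image tag are copied from the command in the comment.

```python
import os
import shlex


def nx_docker_run_argv(display):
    """Build the argument list for running the NX image with X11 forwarding."""
    return [
        "sudo", "docker", "run", "-it", "--privileged",
        # Share the host's X11 socket so GUI demos can open windows.
        "-v", "/tmp/.X11-unix:/tmp/.X11-unix",
        "-e", "DISPLAY=unix" + display,
        "opendr/opendr-toolkit:nx", "/bin/bash",
    ]


if __name__ == "__main__":
    # ":0" is an assumed fallback when DISPLAY is not set on the host.
    argv = nx_docker_run_argv(os.environ.get("DISPLAY", ":0"))
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The script only prints the command. Passing `argv` to `subprocess.run` would execute it, and `xhost +local:root` still has to be run on the host beforehand.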

I pulled the image and tested it on the NX board, and it works for the face recognition demo script. What ROS issues did you fix?

thomaspeyrucain avatar May 26 '22 14:05 thomaspeyrucain

The broken tests on macOS can be ignored: cppcheck was upgraded, and this is being fixed in a separate PR.

ad-daniel avatar Jun 07 '22 08:06 ad-daniel

https://github.com/opendr-eu/opendr/pull/360 is now included in develop

ad-daniel avatar Nov 30 '22 12:11 ad-daniel

https://github.com/opendr-eu/opendr/pull/317 is now included in develop

ad-daniel avatar Dec 08 '22 10:12 ad-daniel

Replaced by #384

tsampazk avatar Dec 20 '22 09:12 tsampazk