jetson-containers
C++ program crashes while running TensorFlow 2.0.0 with CUDA 10.0
I am using JetPack 4.3 on a Tegra TX2. The versions of the other third-party software are listed below.
Protobuf 3.8.0
Eigen 3.3.90
TensorFlow 2.0.0
Python 2.7.17
GCC 7.5.0
Bazel 0.26.1
cuDNN 7.6.3
CUDA 10.0
I compiled TensorFlow 2.0.0 inside a Docker container (tf-base-container has only C++ and is based on Ubuntu 18.04). The container is started with the following command:
docker container run --privileged -e DISPLAY=$DISPLAY -v /tmp/X11-unix:/tmp/X11-unix -e PATH=:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin -e LD_LIBRARY_PATH=:/usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu/tegra:/usr/local/cuda/lib64:/usr/local/lib:/usr/lib:/lib -v /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu -v /usr/local/cuda-10.0:/usr/local/cuda --net=host -v /root/disk-tx2/:/root/disk -v /dev:/dev -ti tf-base-container:latest /bin/bash
The following command was used for compilation. It successfully generated libtensorflow_cc.so.2.0.0:
bazel build --config=opt --config=v2 --config=noaws --config=nohdfs --config=noignite --config=nokafka --config=monolithic --config=cuda --config=numa --verbose_failures //tensorflow:libtensorflow_cc.so
However, when I run a sample program (inside the same container on the TX2) that uses the generated library libtensorflow_cc.so.2.0.0, it fails with the error below.
$ ./object_detection obj_139.jpg frozen_inference_graph.pb label_map.pbtxt
Height 1200 Width 1920
labels path ../demo/asset-inference-graph/label_map.pbtxt
graph path ../demo/asset-inference-graph/frozen_inference_graph.pb
2020-07-07 04:02:50.339682: E /root/disk/tx2/object_detection_demo/main_buffer.cpp:393] graph_path:../demo/asset-inference-graph/frozen_inference_graph.pb
2020-07-07 04:02:50.442481: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-07 04:02:50.451361: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-07 04:02:50.451517: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3
pciBusID: 0000:00:00.0
2020-07-07 04:02:50.451562: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-07-07 04:02:50.451656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-07 04:02:50.451809: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-07 04:02:50.451905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-07 04:03:42.801867: E tensorflow/core/common_runtime/session.cc:78] Failed to create session: Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error
Segmentation fault (core dumped)
I would appreciate any help with this. Let me know if more information is needed.