
C++ Program crashed while running tensorflow 2.0.0 with cuda 10.0


I am using JetPack 4.3 on a Tegra TX2. Below are the versions of the other third-party software:

Protobuf 3.8.0
Eigen 3.3.90
TensorFlow 2.0.0
Python 2.7.17
GCC 7.5.0
Bazel 0.26.1
cuDNN 7.6.3
CUDA 10.0

I have compiled TensorFlow 2.0.0 inside a Docker container (tf-base-container contains only the C++ toolchain and is based on Ubuntu 18.04). Below is the command used to run the container.

docker container run --privileged -e DISPLAY=$DISPLAY -v /tmp/X11-unix:/tmp/X11-unix -e PATH=:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin -e LD_LIBRARY_PATH=:/usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu/tegra:/usr/local/cuda/lib64:/usr/local/lib:/usr/lib:/lib -v /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu -v /usr/local/cuda-10.0:/usr/local/cuda --net=host -v /root/disk-tx2/:/root/disk -v /dev:/dev -ti tf-base-container:latest /bin/bash 

The following command was used for compilation; it successfully generated libtensorflow_cc.so.2.0.0:

bazel build --config=opt --config=v2 --config=noaws --config=nohdfs --config=noignite --config=nokafka --config=monolithic --config=cuda --config=numa  --verbose_failures //tensorflow:libtensorflow_cc.so
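
For reference, the sample program is linked against this library roughly as below. The paths and flags are illustrative only (TF_ROOT stands for my TensorFlow source checkout mounted into the container, and the extra Protobuf/Eigen/absl include paths are omitted); this is not my exact Makefile:

g++ -std=c++11 -O2 main_buffer.cpp -o object_detection \
    -I${TF_ROOT} -I${TF_ROOT}/bazel-genfiles \
    -L${TF_ROOT}/bazel-bin/tensorflow -ltensorflow_cc \
    -L/usr/local/cuda/lib64 -lcudart \
    -Wl,-rpath,${TF_ROOT}/bazel-bin/tensorflow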

But when I try to run a sample program (inside the same container on the TX2) that uses the generated libtensorflow_cc.so.2.0.0, I get the error below.

$ ./object_detection obj_139.jpg frozen_inference_graph.pb label_map.pbtxt

Height 1200 Width 1920                                                                                       
labels path ../demo/asset-inference-graph/label_map.pbtxt 
graph path  ../demo/asset-inference-graph/frozen_inference_graph.pb                                                                                               
2020-07-07 04:02:50.339682: E /root/disk/tx2/object_detection_demo/main_buffer.cpp:393] graph_path:../demo/asset-inference-graph/frozen_inference_graph.pb                                                                       
2020-07-07 04:02:50.442481: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1                                                                                                                  
2020-07-07 04:02:50.451361: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero                                                                                                             
2020-07-07 04:02:50.451517: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3 pciBusID: 0000:00:00.0                                                                                             
2020-07-07 04:02:50.451562: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.                                                                                           
2020-07-07 04:02:50.451656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-07 04:02:50.451809: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-07 04:02:50.451905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-07 04:03:42.801867: E tensorflow/core/common_runtime/session.cc:78] Failed to create session: Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error
Segmentation fault (core dumped)
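
For context, the session-creation path in the demo looks roughly like the sketch below. This is a simplified sketch, not the exact code of main_buffer.cpp; it assumes the standard TF C++ calls (ReadBinaryProto, NewSession, Session::Create). The gpu_options lines are something I can try, since the TX2 GPU shares physical memory with the CPU, but they are not verified to fix this.

#include <memory>
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"

// Simplified sketch of the failing path (not the exact main_buffer.cpp code).
int main(int argc, char* argv[]) {
  tensorflow::GraphDef graph_def;
  tensorflow::Status status = tensorflow::ReadBinaryProto(
      tensorflow::Env::Default(), argv[2], &graph_def);  // frozen_inference_graph.pb
  if (!status.ok()) return 1;

  tensorflow::SessionOptions options;
  // On the TX2 the GPU shares physical memory with the CPU, so limiting
  // TensorFlow's up-front GPU allocation is commonly suggested (untested here):
  options.config.mutable_gpu_options()->set_allow_growth(true);
  options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(0.5);

  // The log above ("Failed to create session: Internal: CUDA runtime implicit
  // initialization on GPU:0 failed") is reported while the session is created;
  // if NewSession returns nullptr, any later use of it would segfault.
  std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));
  if (session == nullptr) return 1;

  status = session->Create(graph_def);
  if (!status.ok()) return 1;
  return 0;
}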

I would appreciate it if someone could help me with this. Let me know if more info is needed.

harendra247, Jul 07 '20 04:07