tensorflow-deeplab-v3-plus
The inference result is not so good on a Jetson TX2
Dear Rishizek, I am trying inference on an Nvidia Jetson TX2. It works, but the results are not good enough. Do you have any clue about possible reasons? Thanks for checking.
Information:
nvidia@tegra-ubuntu$ uname -r
4.4.38-tegra
nvidia@tegra-ubuntu$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.4 LTS (Xenial Xerus)"
nvidia@tegra-ubuntu:~/dev/github/tensorflow-deeplab-v3-plus$ python3 inference.py --data_dir '/home/nvidia/dev/dev_dataset/data_dir' --infer_data_list '/home/nvidia/dev/dev_dataset/data_dir/imagelist.txt' --model_dir '/home/nvidia/dev/pretrainedmodels/deeplabv3plus_ver1' --output_dir '/home/nvidia/dev/dev_dataset/data_dir'
2018-04-20 03:37:35.811927: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero
2018-04-20 03:37:35.812095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.66GiB freeMemory: 5.99GiB
2018-04-20 03:37:35.812147: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-20 03:37:37.122109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-20 03:37:37.122239: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-04-20 03:37:37.122271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-04-20 03:37:37.122503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5442 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_num_ps_replicas': 0, '_save_summary_steps': 100, '_keep_checkpoint_every_n_hours': 10000, '_master': '', '_tf_random_seed': None, '_task_id': 0, '_save_checkpoints_steps': None, '_model_dir': '/home/nvidia/dev/pretrainedmodels/deeplabv3plus_ver1', '_log_step_count_steps': 100, '_keep_checkpoint_max': 5, '_evaluation_master': '', '_task_type': 'worker', '_is_chief': True, '_global_id_in_cluster': 0, '_num_worker_replicas': 1, '_session_config': None, '_service': None, '_save_checkpoints_secs': 600, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f6c03e710>}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2018-04-20 03:38:08.699511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-20 03:38:08.699640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-20 03:38:08.699675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-04-20 03:38:08.699729: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-04-20 03:38:08.699834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5442 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
INFO:tensorflow:Restoring parameters from /home/nvidia/dev/pretrainedmodels/deeplabv3plus_ver1/model.ckpt-30358
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2018-04-20 03:38:32.594527: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
generating: /home/nvidia/dev/dev_dataset/data_dir/2007_000129_mask.png
The model is big.
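For what it's worth, its size can be checked directly from the checkpoint the log shows being restored. A minimal sketch, assuming TF 1.x and the checkpoint path from the log above:

```python
# Count the parameters stored in the restored checkpoint.
import numpy as np
import tensorflow as tf

reader = tf.train.NewCheckpointReader(
    '/home/nvidia/dev/pretrainedmodels/deeplabv3plus_ver1/model.ckpt-30358')
shape_map = reader.get_variable_to_shape_map()

total = sum(int(np.prod(shape)) for shape in shape_map.values())
print('variables: %d, parameters: %d (~%.1f MB as float32)'
      % (len(shape_map), total, total * 4.0 / 1024 ** 2))
```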
Running the same command on a Titan GPU works well, as shown below. It consumes a lot of compute power...
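Since the same checkpoint behaves well on the Titan, one way to pin down "not good enough" is to diff the two predicted masks for the same input directly. A minimal sketch, assuming Pillow and NumPy are installed; the directory names are only examples:

```python
# Compare the mask produced on the TX2 with the one produced on the Titan.
import numpy as np
from PIL import Image

tx2 = np.array(Image.open('tx2_out/2007_000129_mask.png'))
titan = np.array(Image.open('titan_out/2007_000129_mask.png'))

assert tx2.shape == titan.shape, 'masks must have the same size'
print('pixel agreement: %.2f%%' % (100.0 * np.mean(tx2 == titan)))
```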
Hi @newip , thank you for your interest in the repo. I have never run the model on a Jetson TX2, so I don't have a concrete answer. Inference should work even on the CPU of a regular PC. The following are my guesses, which may be incorrect:
- The input image size differs from the image size used for training. You may refer to here for details.
- The input images are converted from standard JPG files to something else before they reach the model.
- Maybe the Jetson TX2 quantizes the weights to speed up inference and the IoU decreased, although I'm not sure the Jetson TX2 actually quantizes weights.
- A lack of GPU memory affected the model's performance; maybe you can resize the input images to be smaller (see the session-config sketch at the end of this reply).
I hope this helps solve your problem.
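Regarding the GPU-memory guess: the log shows '_session_config': None, so the Estimator runs with TensorFlow's default allocator behavior, and the failed 4.00G allocation suggests it tried to grab one large block from the TX2's shared CPU/GPU memory. One thing to try (untested on a TX2, so treat it as an assumption) is to enable allow_growth and pass the config through a RunConfig:

```python
# Let the GPU allocator grow on demand instead of reserving big blocks.
import tensorflow as tf

session_config = tf.ConfigProto()
session_config.gpu_options.allow_growth = True

run_config = tf.estimator.RunConfig(session_config=session_config)
print(run_config.session_config)  # confirm the option is attached
```

run_config would then go in as the config argument of the tf.estimator.Estimator that inference.py constructs.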
Thanks @rishizek , I will give smaller JPG files a try.
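For reference, the downscaling I have in mind is roughly the following, a minimal sketch assuming Pillow is installed. The 513 px target matches this repo's training crop size if I read the defaults correctly, and the output directory is just an example:

```python
# Shrink every JPG so its longer side is at most 513 px, writing copies
# to a separate directory that inference.py can then read from.
import os
from PIL import Image

src_dir = '/home/nvidia/dev/dev_dataset/data_dir'
dst_dir = '/home/nvidia/dev/dev_dataset/data_dir_small'  # example path
os.makedirs(dst_dir, exist_ok=True)

for name in os.listdir(src_dir):
    if not name.lower().endswith('.jpg'):
        continue
    img = Image.open(os.path.join(src_dir, name))
    img.thumbnail((513, 513), Image.LANCZOS)  # preserves aspect ratio
    img.save(os.path.join(dst_dir, name), quality=95)
```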