
Aborted Error while running detect.py with yolov4-tiny-trt model in Jetson Nano

Open srikar242 opened this issue 4 years ago • 12 comments

Hi @hunglc007, I have converted the yolov4-tiny weights to TensorFlow weights and then converted those into a TensorRT model using your repo. When I run detect.py with the compressed TensorRT model on my system, it works fine. But on the Jetson Nano, the same file with the same code does not work; it gets aborted. I converted the weights to TF on the Nano itself. Below is the error:

2020-09-28 19:38:17.401081: I tensorflow/compiler/tf2tensorrt/convert/convert_nodes.cc:1205] Loaded TensorRT version: 7.1.3
2020-09-28 19:38:17.445402: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer_plugin.so.7
2020-09-28 19:41:16.717255: F tensorflow/core/kernels/resize_bilinear_op_gpu.cu.cc:493] Non-OK-status: GpuLaunchKernel(kernel, config.block_count, config.thread_per_block, 0, d.stream(), config.virtual_thread_count, images.data(), height_scale, width_scale, batch, in_height, in_width, channels, out_height, out_width, output.data()) status: Internal: too many resources requested for launch
Fatal Python error: Aborted

Thread 0x0000007f996b9010 (most recent call first):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60 in quick_execute
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 598 in call
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1746 in _call_flat
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 101 in _call_flat
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1645 in _call_impl
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1605 in __call__
  File "detect.py", line 66 in main
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250 in _run_main
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299 in run
  File "detect.py", line 90 in <module>
Aborted

Any ideas on this?

srikar242 avatar Sep 23 '20 15:09 srikar242

Did you solve it? I'm running into a similar problem.

Accioy avatar Oct 28 '20 07:10 Accioy

Hello @Accioy. In my case it was just a memory issue on the Nano. After solving the memory issue, I didn't get that error anymore.

srikar242 avatar Oct 28 '20 17:10 srikar242
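srikar242 doesn't say exactly how the memory issue was solved. For anyone hitting the same wall, two common mitigations on the Nano are adding swap and stopping TensorFlow from reserving all GPU memory up front. A minimal sketch of the latter (standard TF 2.x API, placed before the model is loaded; this is not necessarily the fix srikar242 used):

import tensorflow as tf

# Allocate GPU memory on demand instead of grabbing it all at start-up.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)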

Hey @srikar242. How did you resolve the memory issue? Did you use a larger SD card than the 4GB one?

parthjdoshi avatar Dec 07 '20 11:12 parthjdoshi

@srikar242 how many fps did you get? Trying to see what I should aim for on mine

bhaktatejas922 avatar Dec 11 '20 21:12 bhaktatejas922

+1 @srikar242 could you explain how you resolved the memory issue please?

I'm doing exactly the same as you (using YOLOv4 in RT on a Jetson Nano), and having exactly the same problem.

I'm running headless, so the system is only using ~400MB, and I increased my swapfile to 16GB. I've also tried reducing the maximum workspace size and maximum batch size set at conversion (to 2GB and 1, respectively).

But I still get the same "too many resources requested for launch" error...

pauljerem avatar Dec 16 '20 12:12 pauljerem
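For reference, this is roughly how the workspace size pauljerem mentions is set during TF-TRT conversion in TF 2.x. The paths below are illustrative, not taken from the repo, and some TF versions also expose a max_batch_size field on the same params object:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# 2 GB workspace, FP16 precision, a single cached engine (illustrative values).
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    max_workspace_size_bytes=2 << 30,
    precision_mode='FP16',
    maximum_cached_engines=1)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='./checkpoints/yolov4-tiny-416',   # hypothetical path
    conversion_params=params)
converter.convert()
converter.save('./checkpoints/yolov4-tiny-trt-416')          # hypothetical path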

So, I figured out from this post that the error relates to CUDA’s maximum number of threads per block being too large.

And, according to this response from Nvidia, config.block_count, which appears in the TensorFlow source file tensorflow/tensorflow/core/kernels/resize_bilinear_op_gpu.cu.cc, needs to be adjusted down (e.g. to 512).

But I don't understand how to set it... Anybody have any ideas?

pauljerem avatar Dec 18 '20 11:12 pauljerem
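config.block_count and config.thread_per_block are computed inside that CUDA kernel file, so changing them means patching the TensorFlow source and rebuilding it, which is a heavy lift on a Nano. If you only want to confirm the per-block limits the launch is bumping into, a quick diagnostic with pycuda (assuming it is installed) looks like this:

import pycuda.driver as cuda

cuda.init()
dev = cuda.Device(0)
# Per-block resource limits of the Nano's Maxwell GPU.
print('max threads per block  :', dev.get_attribute(cuda.device_attribute.MAX_THREADS_PER_BLOCK))
print('max registers per block:', dev.get_attribute(cuda.device_attribute.MAX_REGISTERS_PER_BLOCK))
print('shared memory per block:', dev.get_attribute(cuda.device_attribute.MAX_SHARED_MEMORY_PER_BLOCK))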

@bhaktatejas922 I got around 5 to 6 fps.

srikar242 avatar Dec 28 '20 12:12 srikar242

@arsenal-2004 It was something related to CUDA's threads-per-block limit. I followed a response on an Nvidia forum page to fix it, but I don't remember exactly how, as it was some months back and I've since moved on to other work.

srikar242 avatar Dec 28 '20 12:12 srikar242

I found a solution. The issue can be fixed by adding the following lines at the top of the detect script (before TensorFlow is imported):

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

pauljerem avatar Jan 08 '21 11:01 pauljerem

Hi @pauljerem, I tried your solution and it works!

However, I realized that this method ends up not using the Nano's GPU. I verified this with tf.test.is_gpu_available(), which returned False (without the workaround it returned True). I then tried this on another repo that didn't have the issue, and found that adding the workaround slowed the FPS by about 3x because the GPU wasn't being used.

I'm hoping there's another solution that can both fix this issue and also allow the GPU to be used...

leeping-ng avatar Feb 17 '21 11:02 leeping-ng
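As a side note, tf.test.is_gpu_available() is deprecated in more recent TF releases; an equivalent check in TF 2.1 and later is:

import tensorflow as tf

# An empty list means TensorFlow cannot see the GPU and will run on the CPU.
print(tf.config.list_physical_devices('GPU'))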

@leeping-ng Same here, but changing the 1 to 0 did the trick for me, so maybe run:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

Watashi26 avatar Oct 05 '21 11:10 Watashi26
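To spell out why the index matters: the Nano has a single GPU, which CUDA exposes as device 0, so CUDA_VISIBLE_DEVICES='1' hides the GPU entirely and TensorFlow silently falls back to the CPU (which is what @leeping-ng observed), while '0' keeps it visible. The variable has to be set before TensorFlow is imported:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   # the Nano's only GPU; '1' would hide it and force CPU execution

import tensorflow as tf   # import TF only after the environment variable is set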

@pauljerem were you ever able to find a way to reduce config.block_count down to 512?

pedromarta avatar Apr 19 '23 00:04 pedromarta