Tensorflow-TensorRT

read_pb is slow

Open fugjo16 opened this issue 6 years ago • 12 comments

Dear author,

It's a great project, and the results are good!

But when I ran YOLOv3 with TensorRT on a TX2, it took a long time (about 10~20 minutes) to run read_pb_return_tensors(). Is this normal? I'm wondering whether I did something wrong...

Thanks

fugjo16 avatar Jan 19 '19 08:01 fugjo16

Hi,

Do you: (i) run all the block 2 code of this code file, or (ii) only run the function read_pb_graph("./model/YOLOv3/yolov3_gpu_nms.pb")? If (i), yes, it takes a longer time, since you also perform the TensorRT optimization. But once you have stored trt_model.pb, you can just do something similar to (ii) to load your stored trt_model.pb, and that only takes a few seconds (it also depends on your GPU). By the way, can you share how much improvement you get in terms of FPS after TRT optimization, and which GPU you use? I'm curious about that.
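
For clarity, step (ii) is roughly this (a minimal sketch assuming TF 1.x; the path and return value are illustrative):

```python
# Sketch of step (ii): load an already-stored, TRT-optimized frozen graph.
# Assumes TF 1.x; the .pb path is illustrative.
import tensorflow as tf

def read_pb_graph(pb_path):
    # Read the serialized GraphDef from disk and parse it.
    with tf.gfile.GFile(pb_path, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def

graph_def = read_pb_graph('./model/trt_model.pb')
with tf.Graph().as_default() as graph:
    # Import the parsed GraphDef into the current graph for inference.
    tf.import_graph_def(graph_def, name='')
```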

ardianumam avatar Jan 19 '19 09:01 ardianumam

Hi @ardianumam ,

The situation is (ii): it takes about 15 minutes to load the model. I run this code on a Jetson TX2, but with a 3rd-party carrier board. After loading finishes, I get about 9 FPS, versus about 4 FPS without TensorRT optimization. Maybe the problem is caused by the 3rd-party carrier board or by different package versions; I'll check it. Thanks for your reply.

fugjo16 avatar Jan 21 '19 03:01 fugjo16

@fugjo16 : Did you convert frozen_model.pb to TRT_model.pb on a desktop and then use it on the Jetson TX2? I once did something similar, and yes, it takes a very long time even just to load the TRT_model.pb. That workflow isn't really proper anyway, since TensorRT optimization generates a model optimized specifically for the machine on which the optimization is run.

If not, I wonder how you managed to convert frozen_model.pb to TRT_model.pb on the Jetson TX2, because I've tried several times and it always runs out of memory. -.-
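
For reference, by "convert" I mean roughly this (a minimal sketch assuming TF 1.x's contrib TensorRT module; the output node names and workspace size are illustrative, not the exact values from this repo):

```python
# Sketch: convert frozen_model.pb -> TRT_model.pb with TF-TRT (TF 1.x contrib).
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Load the native frozen TensorFlow graph.
with tf.gfile.GFile('./model/frozen_model.pb', 'rb') as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Run the TensorRT optimization; this is the memory-hungry step on TX2.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=['output_boxes', 'output_scores'],  # hypothetical node names
    max_batch_size=1,
    max_workspace_size_bytes=1 << 30,  # keep modest on TX2 to limit memory use
    precision_mode='FP16')             # TX2 supports fast FP16

# Store the optimized graph for later (fast-ish) reuse on the same machine.
with tf.gfile.GFile('./model/TRT_model.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())
```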

ardianumam avatar Jan 21 '19 03:01 ardianumam

@ardianumam: No, I converted to TRT_model.pb on the TX2. I use swap to get some more memory, as described in the link below. The swap is CPU memory, but it still helped. https://devtalk.nvidia.com/default/topic/1025939/jetson-tx2/when-i-run-a-tensorflow-model-there-is-not-enough-memory-what-shoud-i-do-/ Maybe this is why I need so much time to load the TRT_model...

fugjo16 avatar Jan 21 '19 05:01 fugjo16

@fugjo16 : I just learned about that. I'll try it later on my TX2 too, and will update here soon. Thanks. Yes, that's probably the cause.

ardianumam avatar Jan 21 '19 07:01 ardianumam

@ardianumam Thanks! This problem really confuses me a lot.

fugjo16 avatar Jan 22 '19 01:01 fugjo16

Hi @fugjo16 : I just tried on my TX2, and yes, it took about 15 minutes just to read the <tensorrt_model.pb>, while reading the native TensorFlow <frozen_model>.pb takes only about 5 seconds. lol. Maybe it's due to the swap memory used when performing the TensorRT optimization. I posted to the NVIDIA forum too; I hope someone replies. Or do you plan to, for example, reduce the YOLOv3 architecture so that we can perform the TensorRT optimization on the TX2 without adding swap memory?

ardianumam avatar Jan 24 '19 03:01 ardianumam

Hi @ardianumam: Thanks a lot! I hope someone answers it. lol. Yes, I think that method would work; I'll try it! Thanks :D

fugjo16 avatar Jan 24 '19 09:01 fugjo16

I'd rather say you're hit by the protobuf version/backend. Check: https://devtalk.nvidia.com/default/topic/1046492/tensorrt/extremely-long-time-to-load-trt-optimized-frozen-tf-graphs/

and start with: export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp before running your code. If that doesn't help, update protobuf. I rebuilt it from source.
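
You can check which backend Python actually picked up with something like this (note the env var must be exported before protobuf is first imported, which is why setting it inside your script is usually too late):

```python
# Quick check of the active protobuf backend:
# 'python' is the slow pure-Python parser, 'cpp' the fast native one.
from google.protobuf.internal import api_implementation
print(api_implementation.Type())  # expect 'cpp' after setting the env var
```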

filipski avatar Feb 18 '19 09:02 filipski

@filipski : thanks for the info. I'll give it a try.

ardianumam avatar Feb 19 '19 03:02 ardianumam

I tested with this blog's script. It's easy to modify, and it works for me. https://jkjung-avt.github.io/tf-trt-revisited/

fugjo16 avatar Jun 25 '19 05:06 fugjo16

@fugjo16 @ardianumam I have a YOLOv3 TensorFlow model in both ckpt and .pb format. The model runs on a GTX 1080 Ti at 37 FPS. Now I want to run it on a Xavier NX, but it is very slow, about 2 FPS. How can I optimize this model using TRT to make it faster on the Xavier NX? And how can I convert the .pb model to a .trt engine?

MuhammadAsadJaved avatar Sep 18 '20 10:09 MuhammadAsadJaved