Running multiple TensorRT-optimized models in TensorFlow
I am working on a TensorFlow 2.0 project that uses multiple models for inference. Some of those models were optimized with TF-TRT.
I tried both regular offline conversion and offline conversion with engine serialization. With regular conversion, the TensorRT engine is rebuilt every time the model execution context changes. With serialized engines, I am not able to load more than one TensorRT-optimized model.
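For reference, the conversion with engine serialization that I am describing looks roughly like the sketch below (the model directory, input shape, and conversion parameters are placeholders rather than my exact setup, and the API details may vary slightly between TF versions):

```python
import numpy as np
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Offline TF-TRT conversion of a SavedModel with engine serialization.
# "my_model" and the (1, 224, 224, 3) shape are placeholders.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode="FP16",
    maximum_cached_engines=1)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="my_model",
    conversion_params=params)
converter.convert()

# Pre-build the TensorRT engines for a representative input shape so they
# are serialized with the SavedModel and not rebuilt at runtime.
def input_fn():
    yield (np.zeros((1, 224, 224, 3), dtype=np.float32),)

converter.build(input_fn=input_fn)
converter.save("my_model_trt")
```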
My application uses a single Session at runtime.
I am using the nvcr.io/nvidia/tensorflow:19.12-tf2-py3 Docker container to optimize the models and run the application.
More details about the issue: https://stackoverflow.com/questions/60967867/running-multiple-tensorrt-optimized-models-in-tensorflow
What is the correct approach to running multiple TensorRT-optimized models with pre-built engines simultaneously in TensorFlow?
Is it a valid solution to use a separate Session for each of those models?
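For context, the usage pattern I would like to achieve is roughly the following (the model directories and input shapes are placeholders, not my actual models):

```python
import numpy as np
import tensorflow as tf

# Load two TF-TRT converted SavedModels in the same process.
# "model_a_trt" and "model_b_trt" are placeholder paths.
model_a = tf.saved_model.load("model_a_trt")
model_b = tf.saved_model.load("model_b_trt")

infer_a = model_a.signatures["serving_default"]
infer_b = model_b.signatures["serving_default"]

# Run inference with both models in the same process; with pre-built
# engines, switching between models should not trigger engine rebuilds.
x_a = tf.constant(np.zeros((1, 224, 224, 3), dtype=np.float32))
x_b = tf.constant(np.zeros((1, 128, 128, 3), dtype=np.float32))

out_a = infer_a(x_a)
out_b = infer_b(x_b)
```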
Thanks for the detailed report. Having multiple models with multiple pre-built engines is a valid use case. We seem to have a problem with the way the engines are cached, and we are working on it. This is related to Issue #195; we will continue the discussion there.
@tfeher I am also having a problem running two TensorRT-optimized models. Inference completes for the first network, but the second network then fails with the errors included below. Is this a similar issue or something completely different? I am using TF 2.1.0, and both models run properly when used separately; however, when I load both models in the same program and run inference sequentially, the second model always fails with the cache size error.
2020-06-23 12:22:53.617659: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at trt_engine_op.cc:494 : Invalid argument: Input shape list size mismatch for PartitionedCall/TRTEngineOp_5, cached size: 6 vs. actual size: 1
2020-06-23 12:22:53.654311: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Input shape list size mismatch for PartitionedCall/TRTEngineOp_5, cached size: 6 vs. actual size: 1
[[{{node PartitionedCall/TRTEngineOp_5}}]]
Traceback (most recent call last):
File "live_inf.py", line 108, in
Function call stack: signature_wrapper
@anoushsepehri I am facing the same issue with multiple networks converted with TensorRT. Have you found any workaround?