Output shape read from the engine file does not match the profile in the log
Description
I exported the ONNX model to a TensorRT engine with both trtexec and polygraphy. The model's outputs were correct, but the output shape I read from the engine file in my code did not match the shape reported in the log.
Environment
Docker image: nvidia/cuda:12.1.0-devel-ubuntu20.04
TensorRT Version: 10.0.1.6
NVIDIA GPU: L4
NVIDIA Driver Version: 535.129.03
CUDA Version: 12.1.0
CUDNN Version:
Operating System: Ubuntu 20.04
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link: https://drive.google.com/file/d/197HgyO3Dwua89QERaDfYIx-hNhNpUKJJ/view?usp=drive_link
Steps To Reproduce
Commands or scripts:
polygraphy run ./weights/parseq_ar_decoder_bs1_smi.onnx --onnxrt --trt --pool-limit workspace:1G --save-engine=./weights/parseq_ar_decoder_bs1_smi.trt --trt-min-shapes tgt:[1,1] --trt-opt-shapes tgt:[1,41] --trt-max-shapes tgt:[1,251] --atol 1e-3 --rtol 1e-3 --verbose
/TensorRT-10.0.1.6/bin/trtexec --onnx=./weights/parseq_ar_step_decoder_bs1_smi.onnx --saveEngine=./weights/parseq_ar_step_decoder_bs1_smi.trt --minShapes=tgt:1x1 --optShapes=tgt:1x1 --maxShapes=tgt:1x251
Have you tried the latest release?: yes.
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
Yes. I ran inference on the ONNX model in Python and the results were normal.
I allocate memory using the code linked below:
https://github.com/NVIDIA/TensorRT/blob/6d2fa4df7fa3e3f4bf5dd27586ba053b4ae57cd5/samples/python/common_runtime.py#L92
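This allocation step is likely the source of the mismatch: the engine reports -1 for dynamic output dimensions, so the allocator clamps them to the optimization profile's maximum before sizing buffers, and that clamped shape is what the code reads back. A minimal sketch of that logic, pure Python with no TensorRT dependency; the shapes (1, -1, 95) and (1, 251, 95) are hypothetical examples, not taken from this model:

```python
import math

def allocation_shape(engine_shape, profile_max):
    """Replace dynamic (-1) dims with the profile maximum so the
    host/device buffer is large enough for any runtime shape."""
    return tuple(m if d == -1 else d
                 for d, m in zip(engine_shape, profile_max))

def volume(shape):
    """Number of elements in a fully-specified shape."""
    return math.prod(shape)

# Hypothetical output reported by the engine as (1, -1, 95), with a
# profile that allows the sequence dimension to grow up to 251.
engine_shape = (1, -1, 95)
profile_max = (1, 251, 95)
alloc = allocation_shape(engine_shape, profile_max)
print(alloc)          # (1, 251, 95)
print(volume(alloc))  # 23845
```

So a buffer sized from the profile maximum is expected to differ from the concrete per-inference output shape; the two only coincide when the input is at the profile maximum.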
Logs: polygraphy, trtexec (attached).
In the code:
The source code is the decoder part of parseq; I refactored it to use only the AR part (decode_ar=True, refine_iters=False). https://github.com/baudm/parseq/blob/1902db043c029a7e03a3818c616c06600af574be/strhub/models/parseq/model.py#L86
Your model has dynamic shapes; at inference time you must set the real shape of each dynamic input.
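Concretely, in TensorRT 10's Python API the concrete output shape is only available from the execution context after the input shape has been set (context.set_input_shape, then context.get_tensor_shape); querying the engine itself still returns -1 for dynamic dimensions. Below is a runnable stand-in that only mimics this shape bookkeeping so the pattern is visible without a GPU; FakeContext is not a TensorRT class, the output dimension 95 is a made-up example, and the rule "output sequence length follows input sequence length" is an assumption about this model:

```python
# Stand-in mimicking TensorRT's dynamic-shape bookkeeping.  Real code
# would create a tensorrt.IExecutionContext from the deserialized
# engine and call the same two methods on it.
class FakeContext:
    def __init__(self, engine_shapes):
        # name -> shape as reported by the engine (may contain -1)
        self._shapes = dict(engine_shapes)

    def set_input_shape(self, name, shape):
        self._shapes[name] = tuple(shape)
        # Assumption: the output's sequence dim tracks the input's.
        out = list(self._shapes["tgt_out"])
        out[1] = shape[1]
        self._shapes["tgt_out"] = tuple(out)

    def get_tensor_shape(self, name):
        return self._shapes[name]

ctx = FakeContext({"tgt": (1, -1), "tgt_out": (1, -1, 95)})
ctx.set_input_shape("tgt", (1, 41))     # real shape for this AR step
print(ctx.get_tensor_shape("tgt_out"))  # (1, 41, 95)
```

With the real API, repeating set_input_shape before each autoregressive step (tgt grows by one token per step) keeps get_tensor_shape in agreement with the actual output, whereas sizes derived from the engine or the profile maximum will not match.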
Thank you for your reply. In this model I only use the AR version of parseq. Some of the decoder's inputs are set in __init__() and read from self attributes during inference, so there is only one dynamic input, tgt, and tgt_out is the network output.