Output shape read from the engine file does not match the profile in the log
Description
I exported the ONNX model to a TensorRT engine with both trtexec and polygraphy. The model's outputs were correct, but the output shape I read from the engine file in my code did not match the shape reported in the log.
Environment
Docker image: nvidia/cuda:12.1.0-devel-ubuntu20.04
TensorRT Version: 10.0.1.6
NVIDIA GPU: L4
NVIDIA Driver Version: 535.129.03
CUDA Version: 12.1.0
CUDNN Version:
Operating System: Ubuntu 20.04
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link: https://drive.google.com/file/d/197HgyO3Dwua89QERaDfYIx-hNhNpUKJJ/view?usp=drive_link
Steps To Reproduce
Commands or scripts:
polygraphy run ./weights/parseq_ar_decoder_bs1_smi.onnx --onnxrt --trt --pool-limit workspace:1G --save-engine=./weights/parseq_ar_decoder_bs1_smi.trt --trt-min-shapes tgt:[1,1] --trt-opt-shapes tgt:[1,41] --trt-max-shapes tgt:[1,251] --atol 1e-3 --rtol 1e-3 --verbose
/TensorRT-10.0.1.6/bin/trtexec --onnx=./weights/parseq_ar_step_decoder_bs1_smi.onnx --saveEngine=./weights/parseq_ar_step_decoder_bs1_smi.trt --minShapes=tgt:1x1 --optShapes=tgt:1x1 --maxShapes=tgt:1x251
Have you tried the latest release?: yes.
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
Yes. I ran inference on the ONNX model in Python and the results were normal.
I allocate memory using the code linked below:
https://github.com/NVIDIA/TensorRT/blob/6d2fa4df7fa3e3f4bf5dd27586ba053b4ae57cd5/samples/python/common_runtime.py#L92
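This allocation step is likely the source of the mismatch: the engine reports -1 for dynamic output dimensions, so the allocator clamps them to the optimization profile's maximum before sizing buffers, and that clamped shape is what the code reads back. A minimal sketch of that logic, pure Python with no TensorRT dependency; the shapes (1, -1, 95) and (1, 251, 95) are hypothetical examples, not taken from this model:

```python
import math

def allocation_shape(engine_shape, profile_max):
    """Replace dynamic (-1) dims with the profile maximum so the
    host/device buffer is large enough for any runtime shape."""
    return tuple(m if d == -1 else d
                 for d, m in zip(engine_shape, profile_max))

def volume(shape):
    """Number of elements in a fully-specified shape."""
    return math.prod(shape)

# Hypothetical output reported by the engine as (1, -1, 95), with a
# profile that allows the sequence dimension to grow up to 251.
engine_shape = (1, -1, 95)
profile_max = (1, 251, 95)
alloc = allocation_shape(engine_shape, profile_max)
print(alloc)          # (1, 251, 95)
print(volume(alloc))  # 23845
```

So a buffer sized from the profile maximum is expected to differ from the concrete per-inference output shape; the two only coincide when the input is at the profile maximum.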
Logs: polygraphy, trtexec (attached).
In the code:
The source code is the decoder part of parseq; I refactored it to use only the AR part (decode_ar=True, refine_iters=False). https://github.com/baudm/parseq/blob/1902db043c029a7e03a3818c616c06600af574be/strhub/models/parseq/model.py#L86
Your model has dynamic shapes; at inference time you must set the real shape of each dynamic input.
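Concretely, in TensorRT 10's Python API the concrete output shape is only available from the execution context after the input shape has been set (context.set_input_shape, then context.get_tensor_shape); querying the engine itself still returns -1 for dynamic dimensions. Below is a runnable stand-in that only mimics this shape bookkeeping so the pattern is visible without a GPU; FakeContext is not a TensorRT class, the output dimension 95 is a made-up example, and the rule "output sequence length follows input sequence length" is an assumption about this model:

```python
# Stand-in mimicking TensorRT's dynamic-shape bookkeeping.  Real code
# would create a tensorrt.IExecutionContext from the deserialized
# engine and call the same two methods on it.
class FakeContext:
    def __init__(self, engine_shapes):
        # name -> shape as reported by the engine (may contain -1)
        self._shapes = dict(engine_shapes)

    def set_input_shape(self, name, shape):
        self._shapes[name] = tuple(shape)
        # Assumption: the output's sequence dim tracks the input's.
        out = list(self._shapes["tgt_out"])
        out[1] = shape[1]
        self._shapes["tgt_out"] = tuple(out)

    def get_tensor_shape(self, name):
        return self._shapes[name]

ctx = FakeContext({"tgt": (1, -1), "tgt_out": (1, -1, 95)})
ctx.set_input_shape("tgt", (1, 41))     # real shape for this AR step
print(ctx.get_tensor_shape("tgt_out"))  # (1, 41, 95)
```

With the real API, repeating set_input_shape before each autoregressive step (tgt grows by one token per step) keeps get_tensor_shape in agreement with the actual output, whereas sizes derived from the engine or the profile maximum will not match.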
Thank you for your reply. In this model I only use the AR version of parseq. Some of the decoder's inputs are set in __init__() and read from self attributes during inference, so there is only one dynamic input, tgt, and tgt_out is the network output.