Zero Zeng
Just `trtexec --onnx=folded.onnx --fp16 --int8`
@nvpohanh is it a bug?
How is the perf of the following command?
```
trtexec \
    --onnx={os.path.join(SAVE_PATH, '.onnx')} \
    --fp16 --int8 \
    --minShapes=input:1x3x224x224 \
    --optShapes=input:10x3x224x224 \
    --maxShapes=input:64x3x224x224 \
    --explicitBatch \
    --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw \
    --saveEngine={os.path.join(SAVE_PATH, '.trt')}
```
Could you please share the onnx here? If it's a QAT model, `--int8` is required; otherwise TRT will throw an error.
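If you're not sure whether your export is a QAT model, you can check for Q/DQ nodes before handing it to TRT. A minimal sketch below uses a rough heuristic: ONNX stores each node's `op_type` as a plain UTF-8 string inside the protobuf, so a QAT export will contain the `QuantizeLinear`/`DequantizeLinear` byte sequences. (With the `onnx` package installed, iterating `model.graph.node` is the proper way; this byte scan is just a dependency-free approximation.)

```python
from pathlib import Path

def looks_like_qat(onnx_path: str) -> bool:
    """Heuristic check for a QAT export: scan the raw ONNX protobuf for the
    Q/DQ op-type strings, which appear verbatim in the serialized graph."""
    data = Path(onnx_path).read_bytes()
    # DequantizeLinear contains "QuantizeLinear" as a substring, so one
    # check covers both op types.
    return b"QuantizeLinear" in data
```

If this returns `True`, pass `--int8` to trtexec as described above.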
You may be hitting a known issue in TRT 8.6 that is fixed in TRT 9.2. Could you please try the latest TRT 9.2? You can download it from the link below:...
You just hit a bug that is fixed in TRT 9.2 :-)
Could you please share the onnx that can reproduce this issue?
We didn't release it in the official Docker image since it's a limited EA release, but you can build the Docker image yourself using https://github.com/NVIDIA/TensorRT/blob/release/9.2/docker/ubuntu-20.04.Dockerfile
> https://drive.google.com/drive/folders/1DPb0HigtNiI9Z8TCn7z0HTL0PYsPYwJ4?usp=sharing May I ask why I see 2 onnx models here?
That's weird; you should only need 1 onnx. What if you compare the perf using only 1 onnx? Just build it once with full fp16 and once with mixed precision, separately, and compare.
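The comparison above can be sketched as two trtexec builds of the same onnx. This is a hedged sketch, not a definitive recipe: the `--precisionConstraints`/`--layerPrecisions` flags assume a TRT 8.5+ trtexec, and `conv_1` is a hypothetical layer name you'd replace with the layer you want pinned to fp32.

```python
import shutil
import subprocess

def trtexec_cmd(onnx_path: str, mixed: bool) -> list[str]:
    """Build a trtexec command line: full fp16, or fp16 with one layer
    constrained to fp32 (mixed precision)."""
    cmd = ["trtexec", f"--onnx={onnx_path}", "--fp16"]
    if mixed:
        # Assumed TRT 8.5+ flags; "conv_1" is a hypothetical layer name.
        cmd += ["--precisionConstraints=obey",
                "--layerPrecisions=conv_1:fp32"]
    return cmd

def run_if_available(cmd: list[str]) -> None:
    """Only invoke trtexec when it is actually on PATH."""
    if shutil.which("trtexec"):
        subprocess.run(cmd, check=True)

fp16_cmd = trtexec_cmd("model.onnx", mixed=False)
mixed_cmd = trtexec_cmd("model.onnx", mixed=True)
run_if_available(fp16_cmd)
run_if_available(mixed_cmd)
```

Comparing the reported latency/throughput of the two runs tells you what the mixed-precision constraint costs relative to full fp16.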