Zero Zeng
Just `trtexec --onnx=folded.onnx --fp16 --int8`
@nvpohanh is it a bug?
How is the perf of the following command?
```
trtexec \
    --onnx={os.path.join(SAVE_PATH, '.onnx')} \
    --fp16 --int8 \
    --minShapes=input:1x3x224x224 \
    --optShapes=input:10x3x224x224 \
    --maxShapes=input:64x3x224x224 \
    --explicitBatch \
    --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw \
    --saveEngine={os.path.join(SAVE_PATH, '.trt')}
```
Could you please share the onnx here? If it's a QAT model, `--int8` is required; otherwise TRT will throw an error.
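If you're not sure whether your export is a QAT model, you can check for Q/DQ nodes before handing it to TRT. A minimal sketch below uses a rough heuristic: ONNX stores each node's `op_type` as a plain UTF-8 string inside the protobuf, so a QAT export will contain the `QuantizeLinear`/`DequantizeLinear` byte sequences. (With the `onnx` package installed, iterating `model.graph.node` is the proper way; this byte scan is just a dependency-free approximation.)

```python
from pathlib import Path

def looks_like_qat(onnx_path: str) -> bool:
    """Heuristic check for a QAT export: scan the raw ONNX protobuf for the
    Q/DQ op-type strings, which appear verbatim in the serialized graph."""
    data = Path(onnx_path).read_bytes()
    # DequantizeLinear contains "QuantizeLinear" as a substring, so one
    # check covers both op types.
    return b"QuantizeLinear" in data
```

If this returns `True`, pass `--int8` to trtexec as described above.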
You may be hitting a known issue in TRT 8.6 that is fixed in TRT 9.2. Could you please try the latest TRT 9.2? You can download it from the link below:...
You just hit a bug that is fixed in TRT 9.2 :-)
Could you please share the onnx that can reproduce this issue?
We didn't release it in the official Docker image since it's a limited EA release, but you can build the Docker image yourself using https://github.com/NVIDIA/TensorRT/blob/release/9.2/docker/ubuntu-20.04.Dockerfile
> https://drive.google.com/drive/folders/1DPb0HigtNiI9Z8TCn7z0HTL0PYsPYwJ4?usp=sharing May I ask why I see 2 onnx models here?
That's weird; you should only need 1 onnx. What if you compare the perf using only 1 onnx? Just build it once with full fp16 and once with mixed precision, separately, and compare.
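The comparison above can be sketched as two trtexec builds of the same onnx. This is a hedged sketch, not a definitive recipe: the `--precisionConstraints`/`--layerPrecisions` flags assume a TRT 8.5+ trtexec, and `conv_1` is a hypothetical layer name you'd replace with the layer you want pinned to fp32.

```python
import shutil
import subprocess

def trtexec_cmd(onnx_path: str, mixed: bool) -> list[str]:
    """Build a trtexec command line: full fp16, or fp16 with one layer
    constrained to fp32 (mixed precision)."""
    cmd = ["trtexec", f"--onnx={onnx_path}", "--fp16"]
    if mixed:
        # Assumed TRT 8.5+ flags; "conv_1" is a hypothetical layer name.
        cmd += ["--precisionConstraints=obey",
                "--layerPrecisions=conv_1:fp32"]
    return cmd

def run_if_available(cmd: list[str]) -> None:
    """Only invoke trtexec when it is actually on PATH."""
    if shutil.which("trtexec"):
        subprocess.run(cmd, check=True)

fp16_cmd = trtexec_cmd("model.onnx", mixed=False)
mixed_cmd = trtexec_cmd("model.onnx", mixed=True)
run_if_available(fp16_cmd)
run_if_available(mixed_cmd)
```

Comparing the reported latency/throughput of the two runs tells you what the mixed-precision constraint costs relative to full fp16.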