Kevin Chen
Closing this issue as it seems to be resolved. If not, feel free to re-open.
After converting an ONNX FP32 model to an INT8 engine with custom calibration, the engine layers still show FP32
Can you share your model? Testing this workflow with https://github.com/onnx/models/blob/main/validated/vision/classification/resnet/model/resnet50-v1-12.onnx, the output is as expected.
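For background on what INT8 calibration actually produces: the calibrator derives a per-tensor dynamic range (`amax`), and values are symmetrically mapped onto the INT8 range with a single scale. The sketch below is a conceptual illustration of that mapping only, not TensorRT's implementation; the function names are invented for this example.

```python
import numpy as np

def int8_quantize(x: np.ndarray, amax: float):
    """Symmetric INT8 quantization: map [-amax, amax] onto [-127, 127].

    `amax` stands in for the calibration-derived dynamic range; values
    outside it saturate. Conceptual sketch only -- not TensorRT code.
    """
    scale = amax / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 values from INT8 codes."""
    return q.astype(np.float32) * scale

# Example: activations calibrated to amax = 4.0
acts = np.array([0.5, -3.9, 4.2, 0.0], dtype=np.float32)
q, scale = int8_quantize(acts, amax=4.0)
deq = int8_dequantize(q, scale)
# 4.2 saturates to amax; the others round-trip within one step (= scale)
```

If layers still report FP32 after calibration, it usually means the builder fell back to FP32 kernels for those layers (e.g. no INT8 implementation, or INT8 was slower), not that the calibration scales were ignored.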
Fixed. Do I need to fix the Copilot suggestion as well?
@chilo-ms can you help run CI again?
You are correct, the concatenation elimination pass is currently unsupported for plugin nodes. I'll look into updating the TensorRT developer guide about this. Do you have a motivating use case where...
This is a known issue; with the currently pinned version of onnx inside the onnx-tensorrt repository, it should not occur. We have a plan to eventually remove the dependency on...
@OctaAIVision trying to understand the full story here:
- Do engine 1 and engine 2 both have dynamic shapes? What are the profiles that you've set for them?
- Does...
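For context on the profile question above: a TensorRT optimization profile constrains each dynamic dimension of an input to a [min, opt, max] range, and the engine rejects shapes outside those bounds at runtime. The checker below is purely illustrative (the function and the profile layout are invented for this sketch, not the TensorRT API):

```python
def shape_in_profile(shape, profile):
    """Return True if `shape` fits the profile's per-dimension bounds.

    `profile` maps each dynamic dimension index to (min, opt, max).
    Static dimensions are simply omitted. Illustrative only -- this is
    not the TensorRT API.
    """
    return all(lo <= shape[i] <= hi for i, (lo, _opt, hi) in profile.items())

# Profile for an NCHW input with dynamic batch and spatial dims
profile = {0: (1, 8, 32), 2: (224, 224, 448), 3: (224, 224, 448)}

shape_in_profile((4, 3, 224, 224), profile)   # fits the profile
shape_in_profile((64, 3, 224, 224), profile)  # batch 64 exceeds max 32
```

This is why knowing both engines' profiles matters: a shape that is valid for one engine's profile can be out of range for the other's.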
@labderrafie are you still encountering this issue with TRT 10? We recommend using ModelOpt for model quantization now: https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/examples/onnx_ptq/README.md
These results suggest that the FP16 kernels perform similarly to INT8, which is true for a few model architectures. Tagging @nvpohanh to help investigate.
No further development is planned for 7.1 or 8.5. Please use the most recent version available for your device.