Zero Zeng
Looks like an issue with TPAT? TRT doesn't support the IsInf operator at the moment, so it has to be implemented as a plugin.
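In case it helps, here is a minimal sketch (the model path is hypothetical, and it assumes the `onnx` Python package) for locating the IsInf nodes that a plugin would need to cover:

```python
import onnx

# Load the ONNX model (path is a placeholder; point it at your own file).
model = onnx.load("model.onnx")

# List every IsInf node so you know which nodes the plugin must handle.
for node in model.graph.node:
    if node.op_type == "IsInf":
        print(f"IsInf node {node.name!r}: inputs={list(node.input)}, outputs={list(node.output)}")
```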
Can you share your ONNX model and the generated plugin code? That is likely why it failed; my guess is there is an unsupported Cast operation in your model.
```
...
```
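For reference, a small sketch (file name is hypothetical) of dumping the ONNX parser's errors with the TensorRT Python API, which should point at the node it rejects:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# Explicit-batch network, as required by the ONNX parser.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# Path is a placeholder; use the model that fails to build.
with open("model.onnx", "rb") as f:
    ok = parser.parse(f.read())

if not ok:
    # Each error records the failing node/operator.
    for i in range(parser.num_errors):
        print(parser.get_error(i))
```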
> Is it because in TensorRT, Cast OP does not support the None dtype input when the output is bool?

I think so.
> As I understand, step 1 should result in a quantized INT8 model. So I should expect a model which is at least 2x smaller in size and 2x faster...
No access. Have you exported the quantized model to ONNX and run inference with TensorRT?
> No, I use torch-tensorrt and torchscript. Onnx export is not needed in this case, isn't it?

I believe you need to export to ONNX and use TRT's ONNX parser...
https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/userguide.html#export-to-onnx
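A minimal sketch of that export step, following the pytorch-quantization user guide linked above (the model and input shape here are placeholders):

```python
import torch
from pytorch_quantization import nn as quant_nn

# Make TensorQuantizer emit fake-quant ops so the exported ONNX graph
# carries QuantizeLinear/DequantizeLinear nodes that TensorRT understands.
quant_nn.TensorQuantizer.use_fb_fake_quant = True

model.eval()  # `model` is your calibrated/QAT model (placeholder here)
dummy_input = torch.randn(1, 3, 224, 224, device="cuda")

torch.onnx.export(
    model,
    dummy_input,
    "quant_model.onnx",
    opset_version=13,  # Q/DQ export requires opset >= 13
)
```

After that, build the TensorRT engine from the exported ONNX file with the ONNX parser (or trtexec) with INT8 enabled.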
Can you provide a repro for this error, or your ONNX model? I guess it's due to some attribute issue in your model.