Zero Zeng
Looks like an issue with TPAT? TRT doesn't support the IsInf operator at the moment, so it has to be implemented as a plugin.
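In case it helps, here is a minimal sketch (the model path is hypothetical, and it assumes the `onnx` Python package) for locating the IsInf nodes that a plugin would need to cover:

```python
import onnx

# Load the ONNX model (path is a placeholder; point it at your own file).
model = onnx.load("model.onnx")

# List every IsInf node so you know which nodes the plugin must handle.
for node in model.graph.node:
    if node.op_type == "IsInf":
        print(f"IsInf node {node.name!r}: inputs={list(node.input)}, outputs={list(node.output)}")
```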
Can you share your ONNX model and the generated plugin code? That is likely why it failed; my guess is there is an unsupported Cast operation in your model.
```
...
```
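For reference, a small sketch (file name is hypothetical) of dumping the ONNX parser's errors with the TensorRT Python API, which should point at the node it rejects:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# Explicit-batch network, as required by the ONNX parser.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# Path is a placeholder; use the model that fails to build.
with open("model.onnx", "rb") as f:
    ok = parser.parse(f.read())

if not ok:
    # Each error records the failing node/operator.
    for i in range(parser.num_errors):
        print(parser.get_error(i))
```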
> Is it because in TensorRT, Cast OP does not support the None dtype input when the output is bool?

I think so.
> As I understand, step 1 should result in a quantized INT8 model. So I should expect a model which is at least 2x smaller in size and 2x faster...
No access. Have you exported the quantized model to ONNX and run inference with TensorRT?
> No, I use torch-tensorrt and torchscript. Onnx export is not needed in this case, isn't it?

I believe you need to export to ONNX and use TRT's ONNX parser...
https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/userguide.html#export-to-onnx
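A minimal sketch of that export step, following the pytorch-quantization user guide linked above (the model and input shape here are placeholders):

```python
import torch
from pytorch_quantization import nn as quant_nn

# Make TensorQuantizer emit fake-quant ops so the exported ONNX graph
# carries QuantizeLinear/DequantizeLinear nodes that TensorRT understands.
quant_nn.TensorQuantizer.use_fb_fake_quant = True

model.eval()  # `model` is your calibrated/QAT model (placeholder here)
dummy_input = torch.randn(1, 3, 224, 224, device="cuda")

torch.onnx.export(
    model,
    dummy_input,
    "quant_model.onnx",
    opset_version=13,  # Q/DQ export requires opset >= 13
)
```

After that, build the TensorRT engine from the exported ONNX file with the ONNX parser (or trtexec) with INT8 enabled.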
Can you provide a repro for this error, or your ONNX model? I guess it's due to some attribute issue in your model.