TensorRT
🐛 [Bug] Conversion error when using torch-TRT to run the bert model after qat quantization
Bug Description
With the latest code, running the BERT model after QAT quantization fails with the following error, and the model cannot be run.

The error is raised at this code location ( https://github.com/pytorch/TensorRT/blob/master/core/partitioning/shape_analysis.cpp#L93 )
Log analysis points to two causes. First, the latest code supports tuples, so the BERT model is now divided into subgraphs ( https://github.com/pytorch/TensorRT/blob/master/core/compiler.cpp#L434 ). Second, the freeze pass is not triggered in QAT (int8) mode ( https://github.com/pytorch/TensorRT/blob/master/core/lowering/lowering.cpp#L90 ), so the weights are also treated as inputs while the subgraphs are converted and partitioned. As a result, the number of subgraph inputs grows after segmentation (3 -> 398), which triggers the error above.
I also tried rolling back to release 1.1. That code has no tuple support, runs the QAT BERT model ( https://github.com/pytorch/TensorRT/blob/release/1.1/core/compiler.cpp#L423 ), and does not trigger subgraph segmentation.
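The input growth described above can be illustrated with a small toy model. This is only a sketch of the reasoning, not Torch-TensorRT's partitioning code: the `segment_inputs` helper, the two-layer segment, and the value names (`x`, `w1`, `b1`, ...) are all hypothetical. It assumes that a frozen weight is embedded in the segment as a constant, while an unfrozen weight has to be passed in from outside.

```python
from typing import Dict, List, Set


def segment_inputs(segment: Dict[str, List[str]],
                   weights: Set[str],
                   frozen: bool) -> List[str]:
    """Return the values a segment must receive from outside.

    `segment` maps each node name to the names of the values it consumes.
    When `frozen` is True, weights count as embedded constants and are
    not reported as external inputs.
    """
    produced = set(segment)  # values computed inside the segment
    external: List[str] = []
    for consumed in segment.values():
        for value in consumed:
            if value in produced:
                continue  # produced by another node in the same segment
            if frozen and value in weights:
                continue  # constant folded into the graph by freezing
            if value not in external:
                external.append(value)
    return external


# Hypothetical segment: two linear layers applied to one activation `x`.
segment = {
    "linear1": ["x", "w1", "b1"],
    "linear2": ["linear1", "w2", "b2"],
}
weights = {"w1", "b1", "w2", "b2"}

frozen_inputs = segment_inputs(segment, weights, frozen=True)
unfrozen_inputs = segment_inputs(segment, weights, frozen=False)
print(len(frozen_inputs), len(unfrozen_inputs))
```

In this toy case the frozen segment needs only the activation, while the unfrozen one also needs every weight and bias. The same effect across all BERT layers would be consistent with the jump from 3 to 398 inputs reported here.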
To Reproduce
step1: bert model download ( https://zenodo.org/record/4792496#.YyBmlhNBxJU )
step2: Follow the documentation steps to generate the jit.trace model corresponding to qat
step3: jit.load
step4: torch.compile
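Steps 3 and 4 might look roughly like the sketch below. Several details are assumptions not stated in the report: the compile step is taken to be `torch_tensorrt.compile`, the model is taken to have three integer inputs of shape `[batch, seq_len]` (the "3" in "3 -> 398" suggests three inputs), and the batch size, sequence length, and file path are placeholders. The helper names `bert_input_shapes` and `compile_qat_bert` are hypothetical.

```python
from typing import List


def bert_input_shapes(batch_size: int, seq_len: int) -> List[List[int]]:
    # Assumption: three inputs (e.g. input ids, token type ids, attention
    # mask), each of shape [batch_size, seq_len].
    return [[batch_size, seq_len] for _ in range(3)]


def compile_qat_bert(ts_path: str, batch_size: int = 1, seq_len: int = 128):
    # Imported here so the shape helper above can be used without a GPU stack.
    import torch
    import torch_tensorrt

    # step3: load the traced QAT model produced in step2.
    model = torch.jit.load(ts_path).eval().cuda()

    # step4: compile with INT8 enabled; the Q/DQ nodes in the QAT graph
    # carry the scales, so no calibrator is passed in this sketch.
    inputs = [
        torch_tensorrt.Input(shape=shape, dtype=torch.int32)
        for shape in bert_input_shapes(batch_size, seq_len)
    ]
    return torch_tensorrt.compile(
        model,
        inputs=inputs,
        enabled_precisions={torch.int8},
    )
```

Calling `compile_qat_bert("bert_qat_traced.pt")` with a placeholder path to the traced model would be the point where the shape analysis error is expected to appear on the latest code.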
Hi @lixiaolx is the model trained using the PyTorch QAT toolkit?
Yes, this BERT model was trained with the PyTorch QAT tools. It runs on the version of Torch-TensorRT that does not support tuples. With the latest version, the model graph is segmented and the error above occurs. I traced the error to the shape analysis code ( https://github.com/pytorch/TensorRT/blob/master/core/partitioning/shape_analysis.cpp#L93 )
This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days