inference
BERT model int8 comparison
When both are quantized to int8, can an ONNX model (converted from TensorFlow) and a PyTorch model (from Hugging Face) be made to match for the same BERT architecture? In int8 mode, the operators in the last few layers of the ONNX model are completely different from those of the PyTorch model, and the final model outputs do not match. This makes side-by-side use and performance comparison inconvenient.
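One way to quantify how far apart the two int8 models are is to dump the final logits from each runtime as NumPy arrays and compare them numerically, using a loose tolerance since int8 quantization adds noise. A minimal sketch with synthetic arrays standing in for the two dumped outputs (the `compare_logits` helper and the tolerance value are illustrative assumptions, not part of either toolchain):

```python
import numpy as np

def compare_logits(onnx_out, torch_out, atol=1e-1):
    # int8 quantization introduces noise, so compare with a loose tolerance
    diff = np.abs(onnx_out - torch_out)
    return {
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "allclose": bool(np.allclose(onnx_out, torch_out, atol=atol)),
    }

# Synthetic stand-ins: torch output plus small quantization-like noise.
# In practice these would come from session.run(...) and model(...).logits.
rng = np.random.default_rng(0)
torch_out = rng.standard_normal((1, 128, 768)).astype(np.float32)
onnx_out = torch_out + rng.normal(scale=0.01, size=torch_out.shape).astype(np.float32)

report = compare_logits(onnx_out, torch_out)
print(report)
```

If the max absolute difference is large only in the last few layers' outputs, comparing intermediate activations layer by layer (rather than just the final logits) can localize where the two int8 graphs diverge.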