lightseq
Does LightSeq support ONNX export and Triton Inference Server?
Hi team, quick question: does LightSeq support the following?
- Convert HuggingFace BERT/RoBERTa models to int8 precision directly
- If yes, can the converted model be exported to ONNX format directly?
- If so, can the exported ONNX model be loaded correctly using Triton Inference Server?
Converting without calibration or fine-tuning causes a loss of accuracy, so this is currently not supported.
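
To illustrate the accuracy point, here is a minimal sketch of symmetric int8 quantization in plain Python. It is not LightSeq code and does not use any LightSeq API. The activation distribution, the guessed `[-1, 1]` range, and the helper names `quantize_dequantize` and `mean_squared_error` are all assumptions made up for this example. It compares a scale picked blindly with a scale derived from a small calibration sample.

```python
import random


def quantize_dequantize(values, scale):
    """Symmetric int8 fake-quantization: snap to the int8 grid, clamp, map back to floats."""
    out = []
    for v in values:
        q = max(-127, min(127, round(v / scale)))  # clamp to the symmetric int8 range
        out.append(q * scale)
    return out


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)

# Hypothetical activations (assumption: zero-mean Gaussian with std 4.0).
calibration = [rng.gauss(0.0, 4.0) for _ in range(2000)]
activations = [rng.gauss(0.0, 4.0) for _ in range(2000)]

# No calibration: the value range is guessed as [-1, 1] (an arbitrary assumption).
guessed_scale = 1.0 / 127

# With calibration: the scale comes from the largest magnitude seen in sample data.
calibrated_scale = max(abs(v) for v in calibration) / 127

err_guessed = mean_squared_error(
    activations, quantize_dequantize(activations, guessed_scale)
)
err_calibrated = mean_squared_error(
    activations, quantize_dequantize(activations, calibrated_scale)
)

print(f"MSE with guessed scale:    {err_guessed:.6f}")
print(f"MSE with calibrated scale: {err_calibrated:.6f}")
```

In this toy setup the guessed scale clips most values, so its reconstruction error is far larger than the calibrated one. Real models are more involved (per-channel scales, quantization-aware fine-tuning), but the same effect is why a direct conversion tends to hurt accuracy.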