lightseq Does LightSeq support ONNX export and Triton Inference Server?

Open stevezheng23 opened this issue 2 years ago • 1 comment

Hi team, quick question: does LightSeq support the following?

1. Convert HuggingFace BERT/RoBERTa models to int8 precision directly
2. If yes, can the converted model be exported to ONNX format directly?
3. If so, can the exported ONNX model be loaded correctly by Triton Inference Server?

Nov 02 '22 01:11 stevezheng23

Converting a model to int8 directly, without calibration or fine-tuning, causes a loss of accuracy, so this is currently not supported.

Jan 10 '23 03:01 neopro12
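
To illustrate the point about calibration in the reply above, here is a minimal sketch of symmetric int8 quantization in plain Python. It is not LightSeq code and uses no LightSeq API. The helper names (`quantize_dequantize`, `mean_abs_error`), the synthetic Gaussian "activations", and the guessed range of 50.0 are all assumptions made for illustration. The idea is that int8 quantization needs a clipping range: without calibration data the range has to be guessed, and a poor guess wastes most of the 255 available levels.

```python
import random


def quantize_dequantize(values, max_abs):
    """Symmetric int8 round trip (float -> int8 -> float) with a fixed clipping range."""
    scale = max_abs / 127.0
    restored = []
    for v in values:
        q = round(v / scale)
        q = max(-127, min(127, q))  # clip to the symmetric int8 range
        restored.append(q * scale)
    return restored


def mean_abs_error(original, restored):
    """Average absolute difference between the original and restored values."""
    return sum(abs(x - y) for x, y in zip(original, restored)) / len(original)


random.seed(0)
# Hypothetical activations: zero-mean Gaussian with standard deviation 0.5.
activations = [random.gauss(0.0, 0.5) for _ in range(10_000)]

# No calibration: the range is a guess (an arbitrary, too-wide value here).
guessed = quantize_dequantize(activations, max_abs=50.0)

# Calibration: the range is measured on a sample of the data.
calibrated_max = max(abs(v) for v in activations[:1000])
calibrated = quantize_dequantize(activations, max_abs=calibrated_max)

err_guessed = mean_abs_error(activations, guessed)
err_calibrated = mean_abs_error(activations, calibrated)
print(f"guessed range error:    {err_guessed:.5f}")
print(f"calibrated range error: {err_calibrated:.5f}")
```

In this toy setup the calibrated range gives a much smaller round-trip error than the guessed one. Real models add further effects (per-layer ranges, outliers, error accumulation across layers), which is where quantization-aware fine-tuning comes in.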