
Does LightSeq support ONNX export and Triton Inference Server?

Open • stevezheng23 opened this issue 2 years ago • 1 comment

Hi team, quick question: does LightSeq support the following?

  • Convert HuggingFace BERT/RoBERTa models to int8 precision directly
  • If yes, can the converted model be exported to ONNX format directly?
  • If so, can the exported ONNX model be loaded correctly using Triton Inference Server?

stevezheng23, Nov 02 '22 01:11

Converting to int8 without calibration or fine-tuning causes a loss of accuracy, so direct conversion is currently not supported.

neopro12, Jan 10 '23 03:01
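
To illustrate the point about calibration, here is a minimal sketch of symmetric int8 quantization in plain Python. It is not LightSeq code and does not use the LightSeq API. The activation distribution, the guessed range of [-1, 1], and all function names are assumptions made for illustration only. It compares the round-trip error when the scale is guessed against the error when the scale is taken from the observed values, which is the simplest form of calibration.

```python
import random


def quantize_int8(values, scale):
    """Symmetric int8 quantization: round(x / scale), clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map int8 codes back to floating-point values."""
    return [q * scale for q in q_values]


def mean_abs_error(values, scale):
    """Average absolute error after a quantize/dequantize round trip."""
    restored = dequantize(quantize_int8(values, scale), scale)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


random.seed(0)

# Hypothetical activations concentrated in a narrow range (assumption).
activations = [random.gauss(0.0, 0.05) for _ in range(10_000)]

# No calibration: the scale comes from a guessed range of [-1, 1].
uncalibrated_scale = 1.0 / 127

# Calibration: the scale comes from the largest magnitude actually observed.
calibrated_scale = max(abs(v) for v in activations) / 127

err_uncal = mean_abs_error(activations, uncalibrated_scale)
err_cal = mean_abs_error(activations, calibrated_scale)

print(f"mean abs error, guessed scale:    {err_uncal:.6f}")
print(f"mean abs error, calibrated scale: {err_cal:.6f}")
```

With a guessed range much wider than the data, most of the 255 int8 levels go unused, so the rounding error per value is larger. Real calibration for transformer models is more involved (per-layer or per-channel scales, representative input data), but the underlying reason is the same.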