gpt-neox
ONNX Export / Inference Engine
ONNX is a common interchange format for exporting models for deployment.
Describe the solution you'd like
A command-line tool that would export the model as a usable ONNX file for the NVIDIA Triton (TensorRT) inference engine.
- [ ] prototype the command line
- [ ] Investigate what operations used by gpt-neox are not currently supported by ONNX