gpt-neox
ONNX Export / Inference Engine
ONNX is a common interchange format for exporting models for deployment.
Describe the solution you'd like
A command-line tool that would export the model as a usable ONNX file for the NVIDIA Triton (TensorRT) inference engine.
- [ ] prototype the command line
- [ ] Investigate what operations used by gpt-neox are not currently supported by ONNX