optimum
C++ ONNX export of HuggingFace Transformers models
Feature request
Right now, it seems that ONNX exports of Hugging Face models are only supported for inference in Python. It would be great if we could export Hugging Face Transformers models to ONNX and use them in C++ applications.
Motivation
For C++ applications.
Your contribution
Will try, but probably not.
Thank you, I agree it would be very nice.
This library, https://github.com/marella/ctransformers, is close to what you are suggesting, except that it uses GGML instead of ONNX Runtime for inference.