optimum
Make optimized ONNX external data files downloadable from Hugging Face Hub
What does this PR do?
This PR adds `.data` to the external data file suffixes so that an optimized model can be downloaded from Hugging Face Hub, as the optimizer saves models with `.data` instead of `_data`.
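The idea can be sketched roughly as follows. This is a minimal illustration and not Optimum's actual code: the constant `EXTERNAL_DATA_SUFFIXES` and both helper functions are hypothetical names chosen for this example.

```python
# Hypothetical constant: the name and contents are assumptions for illustration,
# not the identifiers used in Optimum itself.
EXTERNAL_DATA_SUFFIXES = ("_data", ".data")


def candidate_external_data_names(model_file_name, suffixes=EXTERNAL_DATA_SUFFIXES):
    """Return every external data file name to look for next to the model file."""
    return [model_file_name + suffix for suffix in suffixes]


def find_external_data(model_file_name, repo_files, suffixes=EXTERNAL_DATA_SUFFIXES):
    """Return the candidate names that are actually present in the repository listing."""
    available = set(repo_files)
    return [
        name
        for name in candidate_external_data_names(model_file_name, suffixes)
        if name in available
    ]


if __name__ == "__main__":
    files = ["config.json", "model_optimized.onnx", "model_optimized.onnx.data"]
    print(find_external_data("model_optimized.onnx", files))
```

With only `_data` in the suffix list, the `.data` file in the listing above would not be matched, which is the case this PR is meant to cover.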
Fixes # (issue)
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you make sure to update the documentation with your changes?
- [ ] Did you write any new necessary tests?
Hi @kazssym, Optimum ONNX export uses `model.onnx_data` for the external data file, contrary to other tools that use `model.onnx.data`. Hence the check on the Hub. Has this caused problems for you?
When I stored my external data file as `.onnx_data` on the Hub, it could be downloaded by Optimum, but ONNX Runtime failed to load the file as it assumes `.onnx.data` internally. Renaming it to `.onnx.data` on the Hub makes Optimum ignore the file, so ONNX Runtime still cannot use it. It is inconvenient if I must manually download model data from the Hub.
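One way to check which file name a given model expects is to read the external data locations recorded in the ONNX file itself. The sketch below assumes the `onnx` package is installed and only inspects graph initializers; the function names are hypothetical, and the constant `EXTERNAL` mirrors `onnx.TensorProto.EXTERNAL`, assumed here to have the value 1.

```python
from pathlib import Path

# Assumed to equal onnx.TensorProto.EXTERNAL (the "stored outside the file" marker).
EXTERNAL = 1


def external_data_locations(model):
    """Collect the external data file names referenced by a loaded ONNX model.

    Only graph initializers are inspected in this sketch.
    """
    locations = set()
    for tensor in model.graph.initializer:
        if tensor.data_location != EXTERNAL:
            continue  # tensor data is embedded in the model file
        for entry in tensor.external_data:
            if entry.key == "location":
                locations.add(entry.value)
    return locations


def missing_external_files(model_path):
    """Return referenced external data files that are absent next to the model."""
    import onnx  # imported lazily so the helper above works without onnx installed

    # load_external_data=False reads only the graph, not the tensor payloads.
    model = onnx.load(str(model_path), load_external_data=False)
    folder = Path(model_path).parent
    return sorted(
        name for name in external_data_locations(model) if not (folder / name).is_file()
    )
```

If `missing_external_files("/path/to/model.onnx")` returns a non-empty list, the names in it are the files that would need to sit next to the model file for loading to succeed.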
Thank you @kazssym, I am not sure I understand. Could you open an issue about that with a reproduction?
For example,
optimum-cli export onnx --model saibo/llama-1B llama_onnx
followed by
from optimum.onnxruntime import ORTModelForCausalLM
model = ORTModelForCausalLM.from_pretrained("/path/to/llama_onnx")
works for me.
I tried to upload an exported ONNX model to Hugging Face Hub and let it be downloaded with Optimum.
I must upload it with the `_data` suffix, but ONNX Runtime could not find it.
https://huggingface.co/kazssym/stablelm-3b-4e1t-onnx-fp32/tree/b9c9eecad782f82dbb163448019a7d2ea1c3f2f2
Then I renamed the file with `.data` and it could not be downloaded by Optimum.
@kazssym Thanks, let me try.
I found that this problem needs more changes, so I have switched this PR to draft for now.
Issue #1736
Updated this PR to cope with this case: https://github.com/huggingface/optimum/blob/fd47a73267c3a71ea4e3c02f92260ae61c5ae372/optimum/onnxruntime/optimization.py#L225