What does this PR do?

This PR adds .data to the recognized external data file suffixes so that an optimized model can be downloaded from the Hugging Face Hub. The optimizer saves external data with a .data suffix instead of _data, so those files were previously skipped.
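
As a rough illustration of the idea (not the actual Optimum code), the suffix check could look like the sketch below. The names EXTERNAL_DATA_SUFFIXES and find_external_data_files are hypothetical and chosen only for this example; the sketch assumes the external data file is named after the model file plus one of the two suffixes.

```python
# Hypothetical names for illustration only; these are not Optimum identifiers.
# "_data" is the suffix used by the Optimum ONNX export,
# ".data" is the suffix used by the optimizer and by other tools.
EXTERNAL_DATA_SUFFIXES = ("_data", ".data")


def find_external_data_files(repo_files, model_file="model.onnx"):
    """Return the entries of repo_files that look like external data for model_file."""
    candidates = {model_file + suffix for suffix in EXTERNAL_DATA_SUFFIXES}
    # Preserve the order of the repository listing.
    return [name for name in repo_files if name in candidates]
```

With only "_data" in the tuple, a repository containing model.onnx.data would yield an empty list, which matches the behavior this PR aims to change.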

Fixes # (issue)

Before submitting

[ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
[ ] Did you make sure to update the documentation with your changes?
[ ] Did you write any new necessary tests?

Feb 27 '24 11:02 kazssym

Hi @kazssym, the Optimum ONNX export uses model.onnx_data for the external data file, unlike other tools that use model.onnx.data. Hence the check on the Hub. Has this caused problems for you?

Feb 28 '24 14:02 fxmarty

When I stored my external data file as .onnx_data on the Hub, Optimum could download it, but ONNX Runtime failed to load the file because it assumes .onnx.data internally. Renaming it to .onnx.data on the Hub makes Optimum ignore the file, so ONNX Runtime cannot use it in that case either. It is inconvenient to have to download the model data from the Hub manually.
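
To make the mismatch easier to check locally, here is a minimal sketch that reports which of the two naming conventions is present next to a model file. The function name external_data_suffixes_present is hypothetical, and the sketch only looks at file names on disk; it does not inspect the ONNX model itself.

```python
from pathlib import Path

# The two suffixes discussed in this thread.
KNOWN_SUFFIXES = ("_data", ".data")


def external_data_suffixes_present(model_path):
    """Return which known external data suffixes exist next to model_path."""
    model_path = Path(model_path)
    return [
        suffix
        for suffix in KNOWN_SUFFIXES
        # e.g. model.onnx -> model.onnx_data or model.onnx.data
        if model_path.with_name(model_path.name + suffix).is_file()
    ]
```

For example, calling it on a downloaded model.onnx shows whether the companion file arrived as model.onnx_data, as model.onnx.data, or not at all.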

Feb 29 '24 00:02 kazssym

Thank you @kazssym, I am not sure I understand. Could you open an issue about this with a reproduction?

For example,

optimum-cli export onnx --model saibo/llama-1B llama_onnx

followed by

from optimum.onnxruntime import ORTModelForCausalLM

model = ORTModelForCausalLM.from_pretrained("/path/to/llama_onnx")

works for me.

Feb 29 '24 10:02 fxmarty

I tried uploading an exported ONNX model to the Hugging Face Hub and downloading it with Optimum.

I had to upload the external data file with the _data suffix, but then ONNX Runtime could not find it.

https://huggingface.co/kazssym/stablelm-3b-4e1t-onnx-fp32/tree/b9c9eecad782f82dbb163448019a7d2ea1c3f2f2

Then I renamed the file to use the .data suffix, and Optimum could no longer download it.

Feb 29 '24 11:02 kazssym

@kazssym Thanks, let me try.

Feb 29 '24 11:02 fxmarty

I found that this problem needs more changes, so I have switched this PR to draft for now.

Feb 29 '24 11:02 kazssym

Issue #1736

Feb 29 '24 14:02 kazssym

Updated this PR to cope with this case: https://github.com/huggingface/optimum/blob/fd47a73267c3a71ea4e3c02f92260ae61c5ae372/optimum/onnxruntime/optimization.py#L225

Mar 03 '24 07:03 kazssym

Make optimized ONNX external data files downloadable from Hugging Face Hub
