Phi3 support
Feature request
Microsoft's new phi3 model, in particular the 128K-context mini model, is not supported by Optimum export.
The error is:
"ValueError: Trying to export a phi3 model, that is a custom or unsupported architecture, but no custom export configuration was passed as custom_export_configs. Please refer to https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models for an example on how to export custom models. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the model type phi3 to be supported natively in the ONNX export."
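For reference, a minimal sketch that reproduces the error (the checkpoint id microsoft/Phi-3-mini-128k-instruct is an assumption for the 128K mini model; the call mirrors the export example further below):
# Minimal reproduction sketch: with no ONNX export config registered for
# the "phi3" model type, main_export raises the ValueError quoted above.
from optimum.exporters.onnx import main_export

main_export(
    model_name_or_path="microsoft/Phi-3-mini-128k-instruct",
    task="text-generation-with-past",
    trust_remote_code=True,
    output="phi3_onnx",
)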
Motivation
Phi3-mini is potentially very significant as it has a large context but a small size. This could be used in lots of scenarios if it has good performance.
Your contribution
It's unlikely I could do a PR, as ONNX work is not my forte.
https://github.com/huggingface/optimum/blob/56aabbebd0ce532f82f566a2a946769cee3bb36b/optimum/utils/normalized_config.py#L254
Add "phi3": NormalizedTextConfig to this dict and you seem to be all set for phi3-mini.
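For illustration, a sketch of the equivalent runtime patch (assumption: NormalizedTextConfig is sufficient for phi3-mini; editing the _conf dict in normalized_config.py would have the same effect):
# Sketch only: register phi3 with the normalized-config manager at runtime,
# the same mapping the proposed one-line dict edit would add.
from optimum.utils import NormalizedConfigManager, NormalizedTextConfig

NormalizedConfigManager._conf["phi3"] = NormalizedTextConfig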
Patching the TasksManager and NormalizedConfigManager works (until it's added natively):
from transformers import AutoTokenizer
from optimum.exporters import TasksManager
from optimum.exporters.onnx import main_export
from optimum.onnxruntime import ORTModelForCausalLM
from optimum.utils import NormalizedConfigManager
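# Reuse the existing "phi" entries for "phi3": both the export task
# registration and the normalized config, until phi3 is supported natively.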
TasksManager._SUPPORTED_MODEL_TYPE["phi3"] = TasksManager._SUPPORTED_MODEL_TYPE["phi"]
NormalizedConfigManager._conf["phi3"] = NormalizedConfigManager._conf["phi"]
# output = "phi3_onnx"
# main_export(
# model_name_or_path="microsoft/Phi-3-mini-4k-instruct",
# task="text-generation-with-past",
# trust_remote_code=True,
# output=output,
# )
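# export=True converts the PyTorch checkpoint to ONNX on the fly and loads it with ONNX Runtime.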
model = ORTModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True, export=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
inputs = tokenizer(["Hello, my dog is cute"], return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
Although I have the following patch applied, it still can't export the Phi-3 ONNX model.
commit db6db6fc6a0690bce501569ab384f1bf10a2c7da
Author: kunal-vaishnavi [email protected]
Date: Thu May 9 02:12:55 2024 -0700
Add Phi-3 mini to Optimum (#1841)
* Load config from folder
* Add Phi-3 to normalized config
My command is as follows:
optimum-cli export onnx -m ./phi3_128k_lora_sft --monolith --task text-generation --trust-remote-code --framework pt ./phi3_128k_lora_sft_onnx_3
The output and error are as follows:
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00, 1.85s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/usr/local/bin/optimum-cli", line 8, in <module>
    ...
ValueError: Trying to export a phi3 model, that is a custom or unsupported architecture, but no custom onnx configuration was passed as custom_onnx_configs. Please refer to https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models for an example on how to export custom models. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the model type phi3 to be supported natively in the ONNX export.
With PR #1841 submitted by the ORT team, we will be able to load ONNX checkpoints of phi3 uploaded by them (e.g. microsoft/Phi-3-mini-4k-instruct-onnx). But for exporting the model we would need to add an ONNX config for phi3. I will open a PR today.
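For illustration only, a rough sketch (not the actual PR) of what such a config could look like, assuming phi3 can largely reuse the existing phi export config; the real config would also need to be registered with the TasksManager for the "phi3" model type:
# Hypothetical sketch: an ONNX export config for phi3 that simply inherits
# from the existing phi config (assumes the architectures are close enough
# to share the same inputs/outputs and dummy input generators).
from optimum.exporters.onnx.model_configs import PhiOnnxConfig


class Phi3OnnxConfig(PhiOnnxConfig):
    # Phi-3 variants add a much larger context window and, for some sizes,
    # grouped-query attention, so the real config may need further changes.
    pass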
For phi3 small, the team will wait until it becomes part of a stable transformers release (it currently relies on remote code).