Hugging Face ONNX Models
Is it possible to run ONNX models hosted on HF? I tried downloading a model into the model folder, but it is not showing up in `foundry cache list`.
@anktsrkr thanks for raising the issue! Yes - it is possible to download models from Hugging Face (HF) and consume them in Foundry Local. The only caveat is that the model must have a genai_config.json file in its folder on HF.
It is likely that when you downloaded from HF it created symbolic links. To download into your cache directory without symbolic links:
```bash
huggingface-cli download REPO --local-dir ~/.foundry/cache/models
```
Where REPO is the name of the HF repository. If the repo has many directories, you can download a specific one using the `--include` option.
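For example, if the repo keeps its ONNX files in a subfolder, something along these lines should work (the `onnx/*` pattern here is just an illustration - check the repo's actual file layout first):

```bash
# Pull only the ONNX subfolder into the Foundry Local cache, without symlinks.
huggingface-cli download REPO --include "onnx/*" --local-dir ~/.foundry/cache/models
```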
> [!NOTE]
> We're working on having native HF support (pulling ONNX models) in Foundry Local in an upcoming update.
Hey @samuel100, I tried the approach you described and was trying to run Qwen3-0.6B-ONNX. I renamed generation_config.json to genai_config.json and was able to discover the model.
However, when I try to run the model I get an error.
When I compare with the phi-4 model I downloaded, I can see that its genai_config.json has more properties.
Also, looking at genai_config.json, I imagine renaming is not the correct step.
Not sure what the next step is.
Btw! Thanks for the heads up about native HF support!
So, genai_config.json is different from generation_config.json. The genai_config.json file is used by ONNX Runtime, and the repo you are using does not have one available.
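As a quick sanity check before downloading a whole repo, one option is to try fetching just that file - it will fail if the repo doesn't have one (the top-level path here is an assumption; some repos keep the config in a subfolder):

```bash
# Attempt to download only genai_config.json; an error means the repo doesn't ship one at this path.
huggingface-cli download REPO genai_config.json --local-dir ./genai-check
```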
Probably the easiest way to create Qwen3-0.6B for Foundry Local is to use Olive. Detailed documentation on how to do this can be found in Compile Hugging Face models to run on Foundry Local.
Step 1 of the documentation shows how to compile the meta-llama/Llama-3.2-1B-Instruct model. You'd need to update the command to:
```bash
olive auto-opt \
    --model_name_or_path Qwen/Qwen3-0.6B \
    --trust_remote_code \
    --output_path models/Qwen3-0.6B \
    --device cpu \
    --provider CPUExecutionProvider \
    --use_model_builder \
    --use_ort_genai \
    --precision int4 \
    --log_level 1
```
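Once the run finishes, you should be able to drop the output into the cache directory mentioned earlier and check that Foundry Local picks it up. A rough sketch, assuming the default cache location (the exact folder layout Foundry Local expects may differ, so double-check against the linked docs):

```bash
# Copy the compiled model into the Foundry Local cache, then verify it is discovered.
cp -r models/Qwen3-0.6B ~/.foundry/cache/models/
foundry cache list
```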
I appreciate that having to run this yourself is a pain. Let me follow up with our team on how we can make this easier going forward.