Getting "unsupported architecture" error when importing Llama-vision
What is the issue?
I tried to import a fine-tuned llama-3.2-11b-vision model, but I got "Error: unsupported architecture."
To make sure my model was not the problem, I downloaded meta-llama/Llama-3.2-11B-Vision-Instruct from Hugging Face.
I copied the Modelfile from ollama show llama3.2-vision --modelfile.
Then I edited the Modelfile and pointed FROM at the model downloaded from HF.
When I run ollama create llama-vision -f llama-vision.modelfile, I get this:
transferring model data 100%
converting model
Error: unsupported architecture
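For reference, the Modelfile is essentially the stock llama3.2-vision Modelfile with FROM pointed at the local safetensors directory, something like this (the path is illustrative):

```
# llama-vision.modelfile -- FROM points at the directory containing the downloaded safetensors files
FROM ./Llama-3.2-11B-Vision-Instruct
# (TEMPLATE / PARAMETER lines copied from `ollama show llama3.2-vision --modelfile` follow)
```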
OS
macOS
GPU
Apple
CPU
Apple
Ollama version
0.4.0
+1, any safetensors model gets the same error.
The safetensors architectures that are currently supported are:
- Llama 2 and 3 (not the vision models yet unfortunately)
- Gemma 1 and 2
- Bert
- Mixtral
- Phi3
You can watch the convert/convert.go file for changes to func ConvertModel.
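One way to see why a given checkpoint gets rejected is to check the architectures field in its config.json, which (as far as I can tell) is what the converter keys off. For the Llama 3.2 Vision checkpoint above it reports the mllama architecture, which is not in the list above; output below is abbreviated and shown for illustration:

```
grep -A2 '"architectures"' ./Llama-3.2-11B-Vision-Instruct/config.json
#   "architectures": [
#     "MllamaForConditionalGeneration"
#   ],
```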
Is there a command to convert it to GGUF and import it?
> +1, any safetensors model gets the same error.
And plus me makes three. I used hfdownloader, then tried to use a Modelfile to convert oxyapi/oxy-1-small since it was based on Qwen2.5. Shame on me.
Any update on this?
An update (apologies for this going into the weeds on the technical details):
TL;DR: gemma3 works now; mllama will work soonish (hopefully in the next month).
The longer details:
You can import gemma3 with the vision projector through ollama create directly from safetensors (as of 0.6.2 this also works with the --quantize argument). We combined the text model and the vision projector into a single image just to make this easier to deal with in the Modelfile. This was because gemma3 will only be supported in the new ollama engine.
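For anyone wanting to try this, a minimal sketch of what that import looks like; the model names, the directory path, and the quantization type are placeholders, not anything specific from this thread:

```
# Modelfile: FROM points at the local gemma3 safetensors directory, e.g.
#   FROM ./gemma-3-4b-it
ollama create my-gemma3 -f Modelfile
# as of 0.6.2 you can also quantize during import:
ollama create my-gemma3-q4 -f Modelfile --quantize q4_K_M
```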
The mllama architecture (llama3.2 vision) is still a little tricky because it can run either in the new ollama engine or in the old llama.cpp engine. We did not combine the vision/text parts of the model for it, so when we change it to work more like gemma3 it still needs to run in both modes. I think we'll do this relatively soon so that we can move everything over to the new ollama engine instead of supporting it running on llama.cpp.
Yep, I was able to import a fine-tuned gemma3 from safetensors. Hopefully we can do this with Llama-vision soon as well! :)
@pdevine thanks for the update. Being able to do this with llama vision will be a game changer for many, I'm sure. I am having too many issues trying to import a model right now. Please let me know if there is anything I can do to help.