Eric Curtin
We will enable pushing and pulling models to OCI registries in RamaLama once the new "podman artifact" command is complete: https://github.com/containers/ramalama
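As a rough sketch of what that workflow could look like once "podman artifact" lands (the exact subcommand syntax, the oci:// transport form, and the quay.io/example registry path are assumptions, not final behaviour):

```
# Pull a model from an OCI registry (hypothetical registry path)
ramalama pull oci://quay.io/example/gemma3:latest

# Push a local model back to an OCI registry
ramalama push gemma3 oci://quay.io/example/gemma3:latest
```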
> Tried again as I see gfx1100 listed in the vLLM build docs. It wasn't smooth, but I got it running at least locally on Fedora 41 with https://repo.radeon.com/rocm/el9/6.3.4/ packages. No time...
Users have hit this in RamaLama also:

```
Attempted to download Gemma3 from Ollama registry with ramalama run gemma3
Name pulled from https://www.ollama.com/library/gemma3
Got an error when running ramalama run...
```
Just tagging @ochafik and @jan-wassenberg for awareness
Likely related: https://github.com/ggml-org/llama.cpp/issues/12857
Is this Ollama-specific? The above issue doesn't seem to be an Ollama model.
I do think we should try to fix this one way or another; gemma3 is a very popular model:

```
$ ramalama run gemma3
Loading model
llama_model_load: error loading model: error...
```
I'm pointing people to these instead for now: https://github.com/containers/ramalama/pull/1288/files
I think that GPU is gfx1103. Can you check if the relevant file is in the container in /opt? (It should have gfx1103 in the filename)
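Something like this should show whether any gfx1103-specific files are present (the image name is just a placeholder for whichever RamaLama ROCm image you're running):

```
# List any gfx1103-specific library files under /opt inside the container
podman run --rm <your-ramalama-rocm-image> \
  sh -c "find /opt -iname '*gfx1103*' 2>/dev/null"
```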
You could simply be running out of VRAM. How much VRAM does your GPU have?
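On AMD you can check with rocm-smi, for example (output format varies a bit by ROCm version):

```
# Report total and used VRAM for the GPU
rocm-smi --showmeminfo vram
```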
If llama3.2:1b works, you are likely running out of VRAM; I think the default is the 3b variant.
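i.e. something like:

```
# The 1b variant needs far less VRAM than the default 3b
ramalama run llama3.2:1b
```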