frob
https://github.com/ollama/ollama/issues/5975
An example would be helpful in debugging.
You can't name a model ".". Try:

```
ollama create my-new-llama3-model
```
If you wrap your program in a markdown code block (`` ``` ``) it will be rendered better. [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) with `OLLAMA_DEBUG=1` may show useful information, but my guess is that the...
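For reference, the verbose logging mentioned above comes from starting the server with the debug variable set, per the linked troubleshooting doc:

```console
$ OLLAMA_DEBUG=1 ollama serve
```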
```
avril 27 11:22:27 jarvis-server ollama[1145305]: time=2025-04-27T11:22:27.684Z level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=36 layers.split="" memory.available="[23.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="24.4 GiB" memory.required.partial="22.7 GiB" memory.required.kv="640.0 MiB" memory.required.allocations="[22.7 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7...
```
This is performing as expected: the log shows the full model needs 24.4 GiB but only 23.3 GiB of VRAM is available, so 37 of the 41 layers are offloaded.

```
avril 30 00:04:08 jarvis-server ollama[1175]: time=2025-04-30T00:04:08.690Z level=INFO source=server.go:138 msg=offload library=cuda layers.requested=41 layers.model=41 layers.offload=37 layers.split="" memory.available="[23.3 GiB]" memory.gpu_overhead="0 B" memory.required.full="24.4 GiB" memory.required.partial="23.0 GiB" memory.required.kv="640.0 MiB"...
```
Everybody has different hardware, so the layer offloading can't be enforced in the model's download parameters. There is work in progress on making the estimation more accurate, which will...
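That said, if you want to pin the offload count yourself, the documented `num_gpu` option sets how many layers are sent to the GPU; for example (the model name here is illustrative):

```console
$ curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?",
  "options": { "num_gpu": 36 }
}'
```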
This would appear to be a problem with langgraph or your app. It's trying to send a message to ollama with a role that langgraph thinks is not supported by...
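For context, Ollama's chat API itself accepts the roles `system`, `user`, `assistant`, and `tool`; a minimal valid request (model name illustrative) looks like:

```console
$ curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [{ "role": "user", "content": "hello" }]
}'
```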
Until other quants are made available, if you are willing to pull the fp16 quant, you can make your own q5:

```console
$ ollama pull llama3.2-vision:11b-instruct-fp16
$ echo FROM llama3.2-vision:11b-instruct-fp16...
```
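The command above is cut off; a sketch of the full workflow, assuming the standard Modelfile plus `ollama create --quantize` route (the file and output tag names here are guesses), would be:

```console
$ ollama pull llama3.2-vision:11b-instruct-fp16
$ echo FROM llama3.2-vision:11b-instruct-fp16 > Modelfile
$ ollama create --quantize q5_K_M llama3.2-vision:11b-instruct-q5_K_M -f Modelfile
```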
https://ollama.com/frob/llama3.2-vision:11b-instruct-q5_K_M