The deepseek architecture is different from most other models, and ollama sometimes misjudges how much of the model can be loaded into VRAM. You can compensate by decreasing the number of...
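For example, the number of layers offloaded to the GPU can be capped with the `num_gpu` option (a sketch: the model tag and the value 40 below are placeholders, tune them for your setup):

```sh
# Interactively: cap the number of layers offloaded to the GPU
ollama run deepseek-r1:70b
>>> /set parameter num_gpu 40

# Or per-request via the API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:70b",
  "prompt": "hello",
  "options": { "num_gpu": 40 }
}'
```

If loading still fails, keep lowering `num_gpu` until the model fits.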
Has this been resolved?
[Serve logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.
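For reference, per the linked troubleshooting doc (exact paths depend on how ollama was installed):

```sh
# Linux (systemd install): view the server log
journalctl -e -u ollama

# macOS: the server log is written to a file
cat ~/.ollama/logs/server.log

# More verbose logging can be enabled before starting the server
OLLAMA_DEBUG=1 ollama serve
```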
Yes, this looks like ollama being too optimistic about how many layers it can offload. It's figuring it can use [46.8 GiB 46.4 GiB] of [47.3 GiB 47.3 GiB], i.e...
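One quick way to see the split the scheduler actually chose (assuming a reasonably recent ollama build):

```sh
# Lists loaded models; the PROCESSOR column shows the CPU/GPU split,
# e.g. "100% GPU" or "25%/75% CPU/GPU"
ollama ps
```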
Did the mitigations help?
Which mitigations helped?
Can you add logs for that failure?
OK, this is different from the OOM issues. The deepseek class of models doesn't support shifting the context window when the buffer fills up, see [here](https://github.com/ollama/ollama/issues/5975).
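As a possible workaround (a sketch; the model tag and the value 8192 are placeholders), the context length can be raised so the buffer doesn't fill in the first place:

```sh
# Interactively: enlarge the context window
ollama run deepseek-v2
>>> /set parameter num_ctx 8192

# Or per-request via the API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-v2",
  "prompt": "hello",
  "options": { "num_ctx": 8192 }
}'
```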
The deepseek k-shift problem was resolved by #9433.
On the top right of the model page there is a "Use this model" button: click it, select "Ollama", click "Copy", and paste the command into a terminal window.
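The copied command will look something like this (the repo path and quantization tag below are placeholders; the real one comes from the "Copy" button):

```sh
# Pull and run the model directly from Hugging Face
ollama run hf.co/<username>/<model-repo>:Q4_K_M
```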