The deepseek architecture is different from most other models, and ollama sometimes misjudges how much of the model can be loaded into VRAM. You can compensate by decreasing the number of...
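For example, the number of layers offloaded to the GPU can be capped with the `num_gpu` option (a sketch: the model tag and the value 40 below are placeholders, tune them for your setup):

```sh
# Interactively: cap the number of layers offloaded to the GPU
ollama run deepseek-r1:70b
>>> /set parameter num_gpu 40

# Or per-request via the API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:70b",
  "prompt": "hello",
  "options": { "num_gpu": 40 }
}'
```

If loading still fails, keep lowering `num_gpu` until the model fits.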
Has this been resolved?
[Serve logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.
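For reference, per the linked troubleshooting doc (exact paths depend on how ollama was installed):

```sh
# Linux (systemd install): view the server log
journalctl -e -u ollama

# macOS: the server log is written to a file
cat ~/.ollama/logs/server.log

# More verbose logging can be enabled before starting the server
OLLAMA_DEBUG=1 ollama serve
```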
Yes, this looks like ollama being too optimistic about how many layers it can offload. It's figuring it can use [46.8 GiB 46.4 GiB] of [47.3 GiB 47.3 GiB], i.e...
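One quick way to see the split the scheduler actually chose (assuming a reasonably recent ollama build):

```sh
# Lists loaded models; the PROCESSOR column shows the CPU/GPU split,
# e.g. "100% GPU" or "25%/75% CPU/GPU"
ollama ps
```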
Did the mitigations help?
Which mitigations helped?
Can you add logs for that failure?
OK, this is different from the OOM issues. The deepseek class of models doesn't support shifting the context window when the buffer fills up, see [here](https://github.com/ollama/ollama/issues/5975).
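As a possible workaround (a sketch; the model tag and the value 8192 are placeholders), the context length can be raised so the buffer doesn't fill in the first place:

```sh
# Interactively: enlarge the context window
ollama run deepseek-v2
>>> /set parameter num_ctx 8192

# Or per-request via the API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-v2",
  "prompt": "hello",
  "options": { "num_ctx": 8192 }
}'
```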
The deepseek k-shift problem was resolved by #9433.
On the top right of the model page there is a "Use this model" button: click it, select "Ollama", click "Copy", and paste the command into a terminal window.
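The copied command will look something like this (the repo path and quantization tag below are placeholders; the real one comes from the "Copy" button):

```sh
# Pull and run the model directly from Hugging Face
ollama run hf.co/<username>/<model-repo>:Q4_K_M
```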