Model crashes after a while
Model: deepseek-coder-v2:16b-lite-instruct-q4_K_M (From Ollama repo)
GPU: NVIDIA RTX 3060 (12GB)
Ollama version: 0.5.5
UI: Open-WebUI 0.5.4
Description: After a few prompts, DeepSeek Coder V2 (Lite) stops generating the response and the model stops running (GPU usage drops to 0%). If I try to force it to continue the answer, the model reinitializes but crashes again instantly, sometimes generating random information about algebra for a couple of seconds before going down.
Ollama log:
llm_load_vocab: control-looking token: 100002 '<|fim▁hole|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: control-looking token: 100004 '<|fim▁end|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: control-looking token: 100003 '<|fim▁begin|>' was not control-type; this is probably a bug in the model. its type will be overridden
Relevant info: other users appear to be experiencing the same or a similar issue: https://www.reddit.com/r/SillyTavernAI/comments/1hzuzrf/deepseek_on_openrouter_stops_responding_after/
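One unverified workaround I am testing (this is an assumption on my part, not a confirmed fix): the crash pattern looks like it could be VRAM exhaustion once the context fills up, so capping the context window via a custom Modelfile might help. The custom model name and the `num_ctx` value below are examples only:

```
# Modelfile — sketch: cap the context window (4096 is an example value)
FROM deepseek-coder-v2:16b-lite-instruct-q4_K_M
PARAMETER num_ctx 4096
```

Then build and run the variant with:

```
ollama create deepseek-coder-v2-smallctx -f Modelfile
ollama run deepseek-coder-v2-smallctx
```

If the crashes stop with a smaller context, that would at least point to memory pressure rather than the token-type warnings in the log.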
I have this problem too, using almost the same setup. Any alternatives?
I still haven't found a solution. This model codes really well; it's a pity there isn't a fix for this problem yet.