Model crashes after a while
Model: deepseek-coder-v2:16b-lite-instruct-q4_K_M (From Ollama repo)
GPU: NVIDIA RTX 3060 (12GB)
Ollama version: 0.5.5
UI: Open-WebUI 0.5.4
Description: After a few prompts, DeepSeek Coder V2 (Lite) stops generating the response and the model stops running (GPU usage drops to 0%). If I try to force it to continue the answer, the model reinitializes but crashes again instantly, sometimes generating random information about algebra for a couple of seconds before going down.
Ollama log:
llm_load_vocab: control-looking token: 100002 '<|fim▁hole|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: control-looking token: 100004 '<|fim▁end|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: control-looking token: 100003 '<|fim▁begin|>' was not control-type; this is probably a bug in the model. its type will be overridden
Relevant info: other users appear to be experiencing the same or a similar issue: https://www.reddit.com/r/SillyTavernAI/comments/1hzuzrf/deepseek_on_openrouter_stops_responding_after/
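One unverified workaround I am testing (this is an assumption on my part, not a confirmed fix): the crash pattern looks like it could be VRAM exhaustion once the context fills up, so capping the context window via a custom Modelfile might help. The custom model name and the `num_ctx` value below are examples only:

```
# Modelfile — sketch: cap the context window (4096 is an example value)
FROM deepseek-coder-v2:16b-lite-instruct-q4_K_M
PARAMETER num_ctx 4096
```

Then build and run the variant with:

```
ollama create deepseek-coder-v2-smallctx -f Modelfile
ollama run deepseek-coder-v2-smallctx
```

If the crashes stop with a smaller context, that would at least point to memory pressure rather than the token-type warnings in the log.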
I have this problem too, using almost the same setup. Any alternatives?
I still haven't found a solution. This model codes really well; it's a pity there isn't a fix for this problem yet.