I'm trying to run DeepSeek-Coder-V2-Lite-Instruct-GGUF but it doesn't work.
Is this a llama.cpp version issue?
maybe https://github.com/ggerganov/llama.cpp/issues/7979 ?
I disabled flash attention and tried changing the batch size. Now I can load the model, but I get this output: end_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_id_i
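For what it's worth, here is a minimal sketch of how those two workarounds map onto a LocalAI model YAML. This assumes a LocalAI version recent enough to expose the flash_attention and batch fields; if your version ignores them, check its model-config reference:

# Minimal sketch; field names are an assumption based on recent LocalAI configs.
name: deepseek-coder-v2-lite-instruct
parameters:
  model: DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf
flash_attention: false # work around the attention bug from the linked llama.cpp issue
batch: 512 # reduced prompt batch size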
Probably related to the template you are using for this model. Check your model YAML file; it should look something like this:
name: code-13b
context_size: 4096
f16: false # true for GPU acceleration
cuda: false # true for GPU acceleration
gpu_layers: 0 # this model has at most 40 layers; 15-20 is recommended for a half-load on an NVIDIA RTX 4060 Ti (more layers -> more VRAM required); 0 means no GPU offload
parameters:
  model: code-13b.Q5_K_M.gguf
stopwords:
  - "</s>"
template:
  chat: &template |
    Below is an instruction that describes a task. Write a response that appropriately completes the request.
    Instruction: {{.Input}}
    Response:
  # Modify the prompt template here ^^^ as per your requirements
  completion: *template
In stopwords you should put your model's stop words; you can find them in the llama.cpp logs when it loads the model for the first time.
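For DeepSeek-Coder-V2 specifically, the EOS token llama.cpp prints is <｜end▁of▁sentence｜> (full-width bars, U+FF5C, and U+2581 separators, not ASCII pipes and spaces), so the stopwords block would be:

stopwords:
  - <｜end▁of▁sentence｜>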
Also cannot get this to work. I downloaded the model from the gallery using the GUI (/browse). This downloaded the model and a deepseek-coder-v2-lite-instruct.yaml with the contents below.
context_size: 8192
mmap: true
name: deepseek-coder-v2-lite-instruct
parameters:
  model: DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf
stopwords:
  - <｜end▁of▁sentence｜>
template:
  chat: |
    {{.Input -}}
    Assistant: # Space is preserved for templating reasons, but line does not end with one for the linter.
  chat_message: |-
    {{if eq .RoleName "user" -}}User: {{.Content }}
    {{ end -}}
    {{if eq .RoleName "assistant" -}}Assistant: {{.Content}}<｜end▁of▁sentence｜>{{end}}
    {{if eq .RoleName "system" -}}{{.Content}}
    {{end -}}
  completion: |
    {{.Input}}
The model loads fine; however, the output looks like this:
ating linter linterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterlinterl...
So I am guessing it's a template problem. Any ideas on how to run this model?
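One thing worth checking: inside a YAML | block scalar, # does not start a comment, so if the line "Assistant: # Space is preserved for templating reasons..." sits inside the chat scalar, that whole sentence is sent to the model as prompt text. That would match the output above, which loops on "linter". A sketch of the template with that trailing text removed (same variables as the gallery file; that this alone fixes generation is an assumption):

template:
  chat: |
    {{.Input -}}
    Assistant:
  chat_message: |-
    {{if eq .RoleName "user" -}}User: {{.Content}}
    {{ end -}}
    {{if eq .RoleName "assistant" -}}Assistant: {{.Content}}<｜end▁of▁sentence｜>{{end}}
    {{if eq .RoleName "system" -}}{{.Content}}
    {{end -}}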
Same issue here; 'deepseek-coder-v2-lite-instruct' from the models repository is not usable.
BTW, does anyone have a guide on how to 'translate' an Ollama template for use in LocalAI?
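I'm not aware of an official guide, but the mapping is mostly mechanical. A rough sketch, assuming the legacy Ollama template variables (.System / .Prompt / .Response) rather than the newer .Messages form:

# Ollama Modelfile TEMPLATE (legacy variables):
#
#   {{ if .System }}{{ .System }}
#
#   {{ end }}User: {{ .Prompt }}
#
#   Assistant: {{ .Response }}
#
# Approximate LocalAI equivalent: the per-role formatting moves into
# chat_message (keyed on .RoleName / .Content), and chat appends the
# final "Assistant:" cue to the already-joined messages in {{.Input}}.
template:
  chat: |
    {{.Input -}}
    Assistant:
  chat_message: |-
    {{if eq .RoleName "system" -}}{{.Content}}
    {{ end -}}
    {{if eq .RoleName "user" -}}User: {{.Content}}
    {{ end -}}
    {{if eq .RoleName "assistant" -}}Assistant: {{.Content}}{{end}}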
Same, I can't get this to work. It would be great to open up the configs so that we can edit them in the UI instead of having to create a new gallery. @mudler, what you've made here is super powerful, but more customization would be greatly appreciated and would likely reduce the number of issues like this one. In my opinion, at least part of the reason for a UI like this is so that users don't have to worry about or write code. But if we want to run different models, we have to go into the code, which kind of defeats the purpose.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.