macOS: `ollama run openbmb/minicpm-v4.5` fails with a 500 error
Is there an existing issue / discussion for this?
- [x] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- [x] I have searched FAQ
Current Behavior
(base) shaoqiang@shaoqiangdeMacBook-Pro ~ % ollama run openbmb/minicpm-v4.5
Error: 500 Internal Server Error: llama runner process has terminated: error:attach failed: attach failed (Not allowed to attach to process. Look in the console messages (Console.app), near the debugserver entries, when the attach failed. The subsystem that denied the attach permission will likely have logged an informative message about why it was denied.)
Expected Behavior
No response
Steps To Reproduce
No response
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
Anything else?
No response
Same
clip.cpp:4169: Unknown minicpmv version
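The `Unknown minicpmv version` error comes from clip.cpp rejecting the projector file, which usually means the mmproj GGUF carries a `minicpmv_version` value that this ollama build does not know about. A minimal sketch (not the clip.cpp check itself, just a first sanity test) to confirm a file is GGUF at all and read its container version, per the GGUF layout of a `GGUF` magic followed by a little-endian uint32 version:

```python
import struct

def gguf_header(path: str) -> tuple[bytes, int]:
    """Read the GGUF magic and container version from a model file."""
    with open(path, "rb") as f:
        magic = f.read(4)                      # b"GGUF" for valid files
        (version,) = struct.unpack("<I", f.read(4))
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: {magic!r}")
    return magic, version
```

If the header reads fine, the version clip.cpp complains about is a metadata key inside the GGUF KV section, so the real fix is a build of ollama/llama.cpp recent enough to recognize the MiniCPM-V 4.5 projector.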
same
https://github.com/tc-mb/ollama/tree/MIniCPM-V
https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/ollama/minicpm-v4_5_ollama.md
You can try the stable branch we provide to verify the model performance.
I can't make it work:
panchines@fedora:~/ollama/models$ ollama list
NAME ID SIZE MODIFIED
panchines@fedora:~/ollama/models$
ls -lh ggml-model-Q6_K.gguf mmproj-model-f16.gguf
-rw-r--r--. 1 panchines panchines 6,3G ago 27 21:49 ggml-model-Q6_K.gguf
-rw-r--r--. 1 panchines panchines 1,1G ago 27 21:44 mmproj-model-f16.gguf
panchines@fedora:~/ollama/models$ nano minicpmv4.5.Modelfile
panchines@fedora:~/ollama/models$ ollama create minicpm-v4.5 -f minicpmv4.5.Modelfile
gathering model components
copying file sha256:cc3b1fc458dd03f286ef92e27f041a7f8c9f086120a3b38ed45368cf06036f23 100%
copying file sha256:7a7225a32e8d453aaa3d22d8c579b5bf833c253f784cdb05c99c9a76fd616df8 100%
parsing GGUF
using existing layer sha256:cc3b1fc458dd03f286ef92e27f041a7f8c9f086120a3b38ed45368cf06036f23
using existing layer sha256:7a7225a32e8d453aaa3d22d8c579b5bf833c253f784cdb05c99c9a76fd616df8
creating new layer sha256:bc24d57b9016fd0527c4edc5a73b5e2ceecd10be5f1dea29864f4cae1424a7ef
creating new layer sha256:75357d685f238b6afd7738be9786fdafde641eb6ca9a3be7471939715a68a4de
writing manifest
success
panchines@fedora:~/ollama/models$ ollama run minicpm-v4.5
Error: llama runner process has terminated: exit status 2
panchines@fedora:~/ollama/models$ cat minicpmv4.5.Modelfile
FROM ./ggml-model-Q6_K.gguf
FROM ./mmproj-model-f16.gguf
TEMPLATE """{{- if .Messages }}{{- range $i, $_ := .Messages }}{{- $last := eq (len (slice $.Messages $i)) 1 -}}<|im_start|>{{ .Role }}{{ .Content }}{{- if $last }}{{- if (ne .Role "assistant") }}<|im_end|><|im_start|>assistant{{ end }}{{- else }}<|im_end|>{{ end }}{{- end }}{{- else }}{{- if .System }}<|im_start|>system{{ .System }}<|im_end|>{{ end }}{{ if .Prompt }}<|im_start|>user{{ .Prompt }}<|im_end|>{{ end }}<|im_start|>assistant{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""
SYSTEM """You are a helpful assistant."""
panchines@fedora:~/ollama/models$
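A side note on the `ollama create` output above: the layer IDs are content-addressed SHA-256 digests of the blob files, so you can sanity-check that a GGUF in ollama's store wasn't truncated during download or copy. A quick sketch (file names are just the ones from this thread):

```python
import hashlib

def blob_digest(path: str) -> str:
    """Compute the sha256:<hex> ID ollama uses for a blob/layer."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            h.update(chunk)
    return "sha256:" + h.hexdigest()

# e.g. blob_digest("ggml-model-Q6_K.gguf") should match the
# "copying file sha256:cc3b1f..." line printed by `ollama create`
```

If the digest matches, the file is intact and the `exit status 2` crash is a runtime/compatibility problem, not a corrupt download.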
ggml-backend.cpp:750: pre-allocated tensor (cache_k_l6 (view) (copy of Kcur-6)) in a buffer (CUDA0) that cannot run the operation (CPY)
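For the `pre-allocated tensor ... (CUDA0) that cannot run the operation (CPY)` crash: one thing worth trying (an assumption on my side, not a confirmed fix) is restarting the server with flash attention off and an unquantized KV cache, since this ggml KV-cache copy path is sensitive to both settings:

```shell
# Assumption: the CPY-on-CUDA failure is triggered by flash attention and/or
# a quantized KV cache. Both variables below are read by `ollama serve`.
export OLLAMA_FLASH_ATTENTION=0
export OLLAMA_KV_CACHE_TYPE=f16
```

Then restart `ollama serve` in the same shell and re-run the model.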