macOS: `ollama run openbmb/minicpm-v4.5` fails with a 500 error
Is there an existing issue / discussion for this?
- [x] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- [x] I have searched FAQ
Current Behavior
(base) shaoqiang@shaoqiangdeMacBook-Pro ~ % ollama run openbmb/minicpm-v4.5
Error: 500 Internal Server Error: llama runner process has terminated: error:attach failed: attach failed (Not allowed to attach to process. Look in the console messages (Console.app), near the debugserver entries, when the attach failed. The subsystem that denied the attach permission will likely have logged an informative message about why it was denied.)
Expected Behavior
No response
Steps To Reproduce
No response
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
Anything else?
No response
Same
clip.cpp:4169: Unknown minicpmv version
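The `Unknown minicpmv version` error comes from clip.cpp rejecting the projector file, which usually means the mmproj GGUF carries a `minicpmv_version` value that this ollama build does not know about. A minimal sketch (not the clip.cpp check itself, just a first sanity test) to confirm a file is GGUF at all and read its container version, per the GGUF layout of a `GGUF` magic followed by a little-endian uint32 version:

```python
import struct

def gguf_header(path: str) -> tuple[bytes, int]:
    """Read the GGUF magic and container version from a model file."""
    with open(path, "rb") as f:
        magic = f.read(4)                      # b"GGUF" for valid files
        (version,) = struct.unpack("<I", f.read(4))
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: {magic!r}")
    return magic, version
```

If the header reads fine, the version clip.cpp complains about is a metadata key inside the GGUF KV section, so the real fix is a build of ollama/llama.cpp recent enough to recognize the MiniCPM-V 4.5 projector.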
same
https://github.com/tc-mb/ollama/tree/MIniCPM-V
https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/ollama/minicpm-v4_5_ollama.md
You can try the stable branch we provide to verify the model performance.
I can't make it work:
panchines@fedora:~/ollama/models$ ollama list
NAME ID SIZE MODIFIED
panchines@fedora:~/ollama/models$
ls -lh ggml-model-Q6_K.gguf mmproj-model-f16.gguf
-rw-r--r--. 1 panchines panchines 6,3G ago 27 21:49 ggml-model-Q6_K.gguf
-rw-r--r--. 1 panchines panchines 1,1G ago 27 21:44 mmproj-model-f16.gguf
panchines@fedora:~/ollama/models$ nano minicpmv4.5.Modelfile
panchines@fedora:~/ollama/models$ ollama create minicpm-v4.5 -f minicpmv4.5.Modelfile
gathering model components
copying file sha256:cc3b1fc458dd03f286ef92e27f041a7f8c9f086120a3b38ed45368cf06036f23 100%
copying file sha256:7a7225a32e8d453aaa3d22d8c579b5bf833c253f784cdb05c99c9a76fd616df8 100%
parsing GGUF
using existing layer sha256:cc3b1fc458dd03f286ef92e27f041a7f8c9f086120a3b38ed45368cf06036f23
using existing layer sha256:7a7225a32e8d453aaa3d22d8c579b5bf833c253f784cdb05c99c9a76fd616df8
creating new layer sha256:bc24d57b9016fd0527c4edc5a73b5e2ceecd10be5f1dea29864f4cae1424a7ef
creating new layer sha256:75357d685f238b6afd7738be9786fdafde641eb6ca9a3be7471939715a68a4de
writing manifest
success
panchines@fedora:~/ollama/models$ ollama run minicpm-v4.5
Error: llama runner process has terminated: exit status 2
panchines@fedora:~/ollama/models$ cat minicpmv4.5.Modelfile
FROM ./ggml-model-Q6_K.gguf
FROM ./mmproj-model-f16.gguf
TEMPLATE """{{- if .Messages }}{{- range $i, $_ := .Messages }}{{- $last := eq (len (slice $.Messages $i)) 1 -}}<|im_start|>{{ .Role }}{{ .Content }}{{- if $last }}{{- if (ne .Role "assistant") }}<|im_end|><|im_start|>assistant{{ end }}{{- else }}<|im_end|>{{ end }}{{- end }}{{- else }}{{- if .System }}<|im_start|>system{{ .System }}<|im_end|>{{ end }}{{ if .Prompt }}<|im_start|>user{{ .Prompt }}<|im_end|>{{ end }}<|im_start|>assistant{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""
SYSTEM """You are a helpful assistant."""
panchines@fedora:~/ollama/models$
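A side note on the `ollama create` output above: the layer IDs are content-addressed SHA-256 digests of the blob files, so you can sanity-check that a GGUF in ollama's store wasn't truncated during download or copy. A quick sketch (file names are just the ones from this thread):

```python
import hashlib

def blob_digest(path: str) -> str:
    """Compute the sha256:<hex> ID ollama uses for a blob/layer."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            h.update(chunk)
    return "sha256:" + h.hexdigest()

# e.g. blob_digest("ggml-model-Q6_K.gguf") should match the
# "copying file sha256:cc3b1f..." line printed by `ollama create`
```

If the digest matches, the file is intact and the `exit status 2` crash is a runtime/compatibility problem, not a corrupt download.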
ggml-backend.cpp:750: pre-allocated tensor (cache_k_l6 (view) (copy of Kcur-6)) in a buffer (CUDA0) that cannot run the operation (CPY)
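For the `pre-allocated tensor ... (CUDA0) that cannot run the operation (CPY)` crash: one thing worth trying (an assumption on my side, not a confirmed fix) is restarting the server with flash attention off and an unquantized KV cache, since this ggml KV-cache copy path is sensitive to both settings:

```shell
# Assumption: the CPY-on-CUDA failure is triggered by flash attention and/or
# a quantized KV cache. Both variables below are read by `ollama serve`.
export OLLAMA_FLASH_ATTENTION=0
export OLLAMA_KV_CACHE_TYPE=f16
```

Then restart `ollama serve` in the same shell and re-run the model.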