
How to use Gradio with a GGUF model?

Open AndyZocker opened this issue 11 months ago • 3 comments

I was able to install everything successfully on Windows, but I can't load a GGUF model. I entered "openbmb/MiniCPM-o-2_6-gguf" in model_server.py, but I get an error that no config.json was found. I'm really only interested in real-time voice chat, and I don't think the big standard model without GGUF will run on my RTX 3060 with 12 GB. Does something have to be changed in the code, or how do you get GGUF models to work with the included Gradio demo? The videos also show that it even runs on an iPad, and that certainly doesn't use the large model, right? Thanks in advance for any help.

AndyZocker avatar Jan 21 '25 18:01 AndyZocker

You can try the int4 version instead; you only need to replace the model initialization with AutoGPTQForCausalLM.from_quantized in model_server.py.
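For anyone unsure what that change looks like, here is a minimal sketch, assuming the int4 checkpoint is named `openbmb/MiniCPM-o-2_6-int4` (the exact repo name, device string, and surrounding code in model_server.py may differ from your copy):

```python
# Hypothetical sketch of the suggested edit in model_server.py.
# The checkpoint name and device below are assumptions; adjust to your setup.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_path = "openbmb/MiniCPM-o-2_6-int4"  # int4 (GPTQ) checkpoint, assumed name

# Replace the original AutoModel.from_pretrained(...) call with:
model = AutoGPTQForCausalLM.from_quantized(
    model_path,
    trust_remote_code=True,  # MiniCPM-o ships custom modeling code
    device="cuda:0",
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```

The int4 checkpoint should fit comfortably in 12 GB of VRAM, unlike the full bf16 model.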

YuzaChongyi avatar Jan 22 '25 04:01 YuzaChongyi

I still don't understand which code I need to change in model_server.py... is there a tutorial for dummies? I also keep getting an error about FlashAttention, which I did install after finally finding a version that works on my computer.

AndyZocker avatar Jan 25 '25 17:01 AndyZocker

Same issue, has it been solved?

BlackTea-c avatar Feb 10 '25 09:02 BlackTea-c

https://github.com/OpenBMB/MiniCPM-o/blob/main/web_demos/minicpm-o_2.6/model_server.py#L96 try to change this line~
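As background on the original "no config.json" error: GGUF checkpoints are a llama.cpp format, so the transformers-based loader in model_server.py cannot read them directly, which is why the Gradio demo fails when given the gguf repo. If you specifically want to run the GGUF file, a rough sketch with llama-cpp-python (file path and parameters are assumptions, not something from this thread) would look like:

```python
# Hypothetical sketch: running a GGUF checkpoint via llama-cpp-python
# instead of the transformers loader used by the Gradio demo.
from llama_cpp import Llama

llm = Llama(
    model_path="MiniCPM-o-2_6-q4_k_m.gguf",  # local GGUF file, assumed name
    n_gpu_layers=-1,  # offload all layers to the GPU (e.g. an RTX 3060)
    n_ctx=4096,       # context window size
)
```

Note this only covers the language model; the real-time voice features of the demo depend on the full multimodal pipeline, so the int4 route above is likely the more practical option for the Gradio demo.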

Cuiunbo avatar Feb 17 '25 04:02 Cuiunbo