MiniGPT-4
Any idea if this will work on CPU?
First of all, thanks for this great project! The output quality seems very good, and the idea of running a multimodal model locally is awesome. It seems we already have a GPT-4-like multimodal model in our hands, which is very exciting. I was wondering if it is possible to run it with llama.cpp on CPU? I am currently running Vicuna-13B (the 4-bit quantized version) on CPU, and around 8 GB of RAM is enough. It works just fine, and the inference speed is about 1.5 tokens per second on my computer. (It also seems to work on mobile phones with enough memory. I did not try it myself, but I saw a few examples.) llama.cpp has its own file format (ggml) and provides a way to convert the original weights to ggml. It would be great if people with low VRAM or no VRAM could make this work on CPU. Any thoughts?
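For reference, this is roughly how I run Vicuna on CPU today through the llama-cpp-python bindings. A minimal sketch, assuming you already have a ggml-converted 4-bit checkpoint; the file path is hypothetical, and this covers only the language model, not MiniGPT-4's vision side:

from llama_cpp import Llama

llm = Llama(
    model_path="./models/vicuna-13b-q4_0.bin",  # hypothetical path to a 4-bit ggml file
    n_threads=8,                                # tune to your CPU core count
)
output = llm("Q: What can a multimodal model do? A:", max_tokens=64)
print(output["choices"][0]["text"])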
Yep, I need help running it on CPU.
I am downloading and merging the models.
Hi! Has there been any progress on running it on a CPU? I'm really interested in this as well, since I don't have a powerful GPU. Any updates or workarounds you've discovered would be greatly appreciated. Thanks!
@kenneth104 Has any progress been made on running it on CPU? Can you share?
No, I can't run it on CPU. Something still requires CUDA and it throws an error.
If you want to run the demo on CPU, you need to use float32 and initialize all parameters on the CPU.
1. Change demo.py:
# model = model_cls.from_config(model_config).to('cuda:{}'.format(args.gpu_id))
model = model_cls.from_config(model_config).to('cpu')
# chat = Chat(model, vis_processor, device='cuda:{}'.format(args.gpu_id))
chat = Chat(model, vis_processor, device='cpu')
2. Change minigpt4.yaml:
# vit_precision: "fp16"
vit_precision: "fp32"
3. Change minigpt4_eval.yaml:
# low_resource: True
low_resource: False
4. Change mini_gpt4.py, around line 90:
# torch_dtype=torch.float16,
torch_dtype=torch.float32,
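For context, that torch_dtype kwarg belongs to the call that loads the language model. Roughly, a hedged sketch using the Hugging Face Transformers API; the exact surrounding code in mini_gpt4.py may differ:

import torch
from transformers import LlamaForCausalLM

# llama_model_path stands in for whatever path the MiniGPT-4 config
# supplies; the real call site may pass additional arguments.
llama_model = LlamaForCausalLM.from_pretrained(
    llama_model_path,
    torch_dtype=torch.float32,  # was torch.float16 on the GPU path
)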
That is all you need to do to run demo.py on CPU, but the speed is very slow.
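One small optional tweak that sometimes helps CPU throughput is pinning PyTorch's thread count. The value below is a hypothetical starting point; tune it to your machine:

import torch
torch.set_num_threads(8)  # hypothetical; set to your physical core count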
@liyaozong1991 thanks, I'll try this.