
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Results 480 MiniCPM-V issues

Running web_demo.py needs 10 GB of GPU memory. Why does serving the model in API mode with vLLM need 23 GB? The web demo is started with `python web_demo.py --device cuda --dtype bf16`. The vLLM command is: `/services/srv/MiniCPM-vllm/venv/bin/python -m vllm.entrypoints.openai.api_server --model /services/srv/MiniCPM-V/openbmb/MiniCPM-V-2/ --trust-remote-code` ![5fc1c797807f86d12cc37208620d731](https://github.com/OpenBMB/MiniCPM-V/assets/11950/5c79bb50-37d8-4b6a-8e72-2ec0cb19c088)
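A likely explanation, not confirmed in this thread: vLLM does not allocate memory on demand. At startup it reserves a fixed fraction of the GPU (the `gpu_memory_utilization` engine argument, 0.9 by default) and uses whatever is left after loading the weights as KV cache. The sketch below only does that arithmetic. The 24 GiB card size and the 7 GiB weight footprint are illustrative assumptions, not values reported in the issue, and both helper functions are hypothetical names.

```python
def vllm_reserved_gib(total_gib: float, gpu_memory_utilization: float = 0.9) -> float:
    """Approximate memory vLLM claims on one GPU at startup.

    Models the reservation as a plain fraction of total GPU memory;
    the real engine also accounts for profiling overhead.
    """
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return total_gib * gpu_memory_utilization


def kv_cache_budget_gib(
    total_gib: float,
    weights_gib: float,
    gpu_memory_utilization: float = 0.9,
) -> float:
    """Memory left for the KV cache once the weights are loaded (never negative)."""
    reserved = vllm_reserved_gib(total_gib, gpu_memory_utilization)
    return max(0.0, reserved - weights_gib)


if __name__ == "__main__":
    total = 24.0    # assumed card size in GiB (not stated in the issue)
    weights = 7.0   # assumed weight footprint in GiB (illustrative only)
    print(f"reserved by vLLM: {vllm_reserved_gib(total):.1f} GiB")
    print(f"KV cache budget:  {kv_cache_budget_gib(total, weights):.1f} GiB")
    print(f"reserved at 0.5:  {vllm_reserved_gib(total, 0.5):.1f} GiB")
```

If this is the cause, passing a lower value such as `--gpu-memory-utilization 0.5` to `vllm.entrypoints.openai.api_server` should shrink the reservation, at the cost of a smaller KV cache and fewer concurrent sequences.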

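Why are many fields left unrecognized when the model reads an ordinary invoice? Is there a particular way the prompt needs to be written?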

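For the test examples on the GitHub page, how can I get streaming output from the model? Alternatively, how can I make the inference example on the Hugging Face page stream its output, character by character, instead of returning the whole sentence at once?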

feature
llamacpp

Hi, thanks for your awesome work! When will you release the tech report or docs for your current work, such as MiniCPM-Llama3-V 2.5 and MiniCPM-V 2.0? Thanks.

First of all, thank you for your impressive work! I've found that your model fares better than the latest LLAVA (13B) on some of my tasks. I've tried running the...

llamacpp

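I tested two models, MiniCPM-2B-dpo-bf16 and MiniCPM-dpo-Int4, with the code below. Inference takes a little over 3 seconds with MiniCPM-2B-dpo-bf16 but more than 10 seconds with MiniCPM-dpo-Int4. What is the reason? ![image](https://github.com/OpenBMB/MiniCPM-V/assets/77612906/88b36241-c1b5-4826-b251-946161658f9d)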

help wanted
inference

File "lib/python3.9/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 190, in forward
    position_ids[batch_idx][p_attn_mask.view(-1).cpu()] = pos_ids
RuntimeError: shape mismatch: value tensor of shape [1037] cannot be broadcast to indexing result of shape [1036]