Cui Junbo
@qyc-98
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/discussions/2#664c8889afe6e8c3a91aea25 Here's how it's handled.
Hi, you can try a more intuitive prompt, e.g. by giving the full expected output format so the model has a reference to follow. You can also refer to our evaluation code: https://github.com/OpenBMB/MiniCPM-V/blob/main/eval_mm/vlmevalkit/vlmeval/vlm/minicpm_llama3_v_2_5.py#L63
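As a rough illustration of "giving the full output format in the prompt", a sketch along these lines may help; the question and the output schema below are made up for demonstration, not from the evaluation code:

```python
# Illustrative only: spell out the complete expected output format inside the
# prompt, so the model has a concrete template to imitate. The field names
# ("objects", "count") are hypothetical examples.
question = "What objects are on the table?"
prompt = (
    f"{question}\n"
    "Answer strictly in the following format:\n"
    "objects: <comma-separated list>\n"
    "count: <integer>\n"
)
print(prompt)
```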
Hi @pzc163, thank you for sharing this important information with us. We are deeply shocked and will pay special attention to this matter. We will immediately launch an investigation...
The conclusion of our investigation:
- Llama3-V can be run using MiniCPM-Llama3-V 2.5's code and config.json after changing param names
- It behaves similarly to MiniCPM-Llama3-V 2.5 in unrevealed experimental...
Try our new model~
Please try our new fine-tuning code.
Yes, both text and images count toward the input length. You can try resizing the frames to be smaller before inference (448×448 ≈ 64 tokens, while 1344×1344 ≈ 640 tokens), or modifying...
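The resizing suggestion above can be sketched roughly as follows, using Pillow; the helper name and the 448-pixel target are illustrative choices based on the token figures in the reply, not the model's actual preprocessing API:

```python
# Hypothetical sketch: downscale video frames before inference so each frame
# falls into a cheaper token bucket (per the reply, 448x448 costs ~64 image
# tokens vs ~640 for 1344x1344).
from PIL import Image

MAX_SIDE = 448  # target the cheaper 448x448 bucket

def shrink_frame(frame: Image.Image, max_side: int = MAX_SIDE) -> Image.Image:
    """Proportionally resize so the longer side is at most `max_side`."""
    w, h = frame.size
    scale = max_side / max(w, h)
    if scale >= 1.0:
        return frame  # already small enough, leave untouched
    return frame.resize((round(w * scale), round(h * scale)), Image.BICUBIC)

# Example: a 1344x756 frame shrinks to 448x252 before being passed to the model.
frame = Image.new("RGB", (1344, 756))
print(shrink_frame(frame).size)  # (448, 252)
```

Keeping the aspect ratio (rather than forcing a square) avoids distorting frame content while still capping the token cost.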
You are very welcome to try modifying it. This is a known issue in MiniCPM-V 2.0; we have already fixed it in models after v2, which use huggingfaceM4's SigLIP implementation and support batched inference~