Hongji Zhu
Our modified llama.cpp has not been merged into the official llama.cpp yet; please try this [PR](https://github.com/ggerganov/llama.cpp/pull/6919) instead.
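For reference, one way to try that branch locally is to fetch the PR head directly from GitHub (the local branch name `minicpmv-pr` below is arbitrary, and the build step may vary with your toolchain):

```shell
# Fetch and check out the llama.cpp PR branch, then build.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git fetch origin pull/6919/head:minicpmv-pr
git checkout minicpmv-pr
make
```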
When using the int4 model, you should remove `torch_dtype=torch.float16` from `AutoModelForCausalLM.from_pretrained()`. For faster inference, consider [vllm](https://github.com/vllm-project/vllm), which already supports running MiniCPM.
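For illustration, a minimal loading sketch is below; the repo id `openbmb/MiniCPM-Llama3-V-2_5-int4` and the `trust_remote_code` flag follow the usual Hugging Face pattern and are assumptions here, not a verified snippet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for the int4 checkpoint.
model_id = "openbmb/MiniCPM-Llama3-V-2_5-int4"

# Note: no torch_dtype=torch.float16 here -- the int4 checkpoint carries
# its own quantization config, and forcing fp16 conflicts with it.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model.eval()
```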
Could you share the code you ran and the input image?
Thanks for your feedback, fixed now.
MiniCPM-Llama3-V 2.5 needs at least 17 GB of GPU memory; an NVIDIA RTX 3090 (24 GB) is fine. The int4 version needs 9 GB of GPU memory.
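If you are unsure whether your card qualifies, a quick sketch to check total VRAM with PyTorch:

```python
import torch

# Print total memory of each visible GPU; the fp16 model needs ~17 GB,
# the int4 version ~9 GB (see above).
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB")
```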
Thanks for your attention. Since the training code and data are deeply tied to our internal infrastructure, we do not intend to open source this part. Please refer to our...
Thanks for the feedback. Please provide your operating system, hardware, and the exact prompt so we can reproduce the issue.
Thanks for your attention. We will support deploying MiniCPM-V 2.5 with vllm soon.
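Once that support lands, usage should follow vllm's standard offline-inference API. The sketch below uses the text-only entry points; the repo id and eventual image-input interface are assumptions, not confirmed behavior:

```python
from vllm import LLM, SamplingParams

# Assumed repo id; image inputs will need whatever interface the upcoming
# vllm support exposes -- this sketch covers text prompts only.
llm = LLM(model="openbmb/MiniCPM-Llama3-V-2_5", trust_remote_code=True)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Describe what MiniCPM-V can do."], params)
print(outputs[0].outputs[0].text)
```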
Inference with MiniCPM-Llama3-V 2.5 in fp16 needs at least 16 GB of GPU memory; int4 needs 8 GB. Full-parameter training needs 8×A100 80GB GPUs. We will release LoRA fine-tuning code...