LLaVA-NeXT Missing mm_projector in latest LLaVa

Open mrd opened this issue 1 year ago • 11 comments

I am attempting to run the finetune_onevision.sh script. I've gotten many things sorted out but I am stumped by the --pretrain_mm_mlp_adapter argument.

The default value as provided in the script is ./checkpoints/projectors/llavanext-openai_clip-vit-large-patch14-336-Qwen_Qwen2-7B-Instruct-mlp2x_gelu-pretrain_blip558k_plain/mm_projector.bin after expanding the environment variables. I made sure that directory exists but I do not know where to find mm_projector.bin for the newest LLaVa. I have found an issue and discussion regarding this parameter for the previous version of LLaVa, e.g. https://huggingface.co/liuhaotian/llava-v1.5-13b/blob/main/mm_projector.bin

I have also looked for some kind of extract_projector script but that does not seem to exist.
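For reference, what I would expect such a script to do is roughly the following. This is only a sketch of the idea, not code from the repository: the function name extract_projector is hypothetical, and it assumes the projector weights sit in the full model's state dict under keys containing the substring "mm_projector", which I have not confirmed for the newest LLaVa.

```python
def extract_projector(state_dict, marker="mm_projector"):
    """Keep only the entries whose key contains `marker`.

    `state_dict` is any mapping from parameter name to value, e.g. a
    loaded checkpoint. The marker "mm_projector" is an assumption about
    how the projector parameters are named, not a confirmed convention.
    """
    return {key: value for key, value in state_dict.items() if marker in key}


# Toy stand-in for a real checkpoint; the key names are illustrative only.
toy_checkpoint = {
    "model.layers.0.self_attn.q_proj.weight": "llm tensor",
    "model.mm_projector.0.weight": "projector tensor A",
    "model.mm_projector.2.bias": "projector tensor B",
}

projector_only = extract_projector(toy_checkpoint)
print(sorted(projector_only))
```

With a real checkpoint, the filtered mapping would then be saved as mm_projector.bin, but that only helps if a compatible full checkpoint is available in the first place.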

This seems to be something rather important and I cannot find any documentation about it at all, apart from the aforementioned GitHub issues for LLaVa 1.5, even after scouring the web with Google and DuckDuckGo.

As a stopgap, I tried the mm_projector.bin downloaded from the link above, from the LLaVa 1.5 liuhaotian archive. Update: this results in a series of size mismatch errors (not surprisingly, really), e.g. size mismatch for 0.weight: copying a param with shape torch.Size([5120, 1024]) from checkpoint, the shape in current model is torch.Size([3584, 1152]).
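To make the mismatch concrete, here is a small sketch of the shapes involved. It assumes that mlp2x_gelu means Linear, GELU, Linear (so the parameter keys are 0.weight, 0.bias, 2.weight, 2.bias) and that a Linear weight is stored as [out_features, in_features]. Under that reading, the checkpoint projects 1024-dimensional vision features into a 5120-dimensional language model, while the current model expects 1152 in and 3584 out. The helper names are hypothetical.

```python
def mlp2x_gelu_shapes(mm_hidden_size, hidden_size):
    """Parameter shapes of an assumed Linear -> GELU -> Linear projector.

    mm_hidden_size: width of the vision features fed into the projector.
    hidden_size:    width of the language model's hidden states.
    """
    return {
        "0.weight": (hidden_size, mm_hidden_size),  # [out_features, in_features]
        "0.bias": (hidden_size,),
        "2.weight": (hidden_size, hidden_size),
        "2.bias": (hidden_size,),
    }


def find_mismatches(checkpoint_shapes, model_shapes):
    """Return {key: (checkpoint_shape, model_shape)} for every differing key."""
    return {
        key: (checkpoint_shapes.get(key), model_shapes.get(key))
        for key in sorted(set(checkpoint_shapes) | set(model_shapes))
        if checkpoint_shapes.get(key) != model_shapes.get(key)
    }


# Dimensions taken from the error message above.
checkpoint = mlp2x_gelu_shapes(mm_hidden_size=1024, hidden_size=5120)
current = mlp2x_gelu_shapes(mm_hidden_size=1152, hidden_size=3584)

for key, (ckpt_shape, model_shape) in find_mismatches(checkpoint, current).items():
    print(f"{key}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Since both dimensions differ, every parameter of the projector mismatches, so the LLaVa 1.5 file cannot be adapted by reshaping; a projector trained for this particular vision tower and language model pair seems to be required.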

Please advise.

Aug 14 '24 14:08 mrd
