LLaVA-NeXT
Missing mm_projector in latest LLaVA
I am attempting to run the finetune_onevision.sh script. I've gotten many things sorted out but I am stumped by the --pretrain_mm_mlp_adapter argument.
After expanding the environment variables, the default value provided in the script is ./checkpoints/projectors/llavanext-openai_clip-vit-large-patch14-336-Qwen_Qwen2-7B-Instruct-mlp2x_gelu-pretrain_blip558k_plain/mm_projector.bin. I made sure that directory exists, but I do not know where to find mm_projector.bin for the newest LLaVA. I have found an issue and a discussion about this parameter for the previous version of LLaVA, which has a published projector file, e.g. https://huggingface.co/liuhaotian/llava-v1.5-13b/blob/main/mm_projector.bin
I have also looked for some kind of extract_projector script but that does not seem to exist.
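To be clear about what I mean by that, here is a minimal sketch of what I would expect such a script to do. It assumes the projector parameters are the state-dict entries whose names contain mm_projector; that is my assumption, and the function name and the toy keys below are hypothetical, not taken from the repository.

```python
from typing import Any, Dict


def extract_projector(state_dict: Dict[str, Any],
                      keyword: str = "mm_projector") -> Dict[str, Any]:
    """Keep only the entries whose parameter name contains `keyword`.

    Hypothetical helper: the keyword and the key layout are assumptions.
    """
    return {name: value for name, value in state_dict.items() if keyword in name}


# Toy stand-in for a full model state dict (the values would normally be tensors).
toy_state = {
    "model.embed_tokens.weight": "tensor-a",
    "model.mm_projector.0.weight": "tensor-b",
    "model.mm_projector.0.bias": "tensor-c",
}

projector_only = extract_projector(toy_state)
print(sorted(projector_only))
```

With real weights, the filtered dict could then be written out with torch.save(projector_only, "mm_projector.bin"), but I do not know whether that matches the format the training script expects.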
This seems to be something rather important, and I cannot find any documentation about it at all, apart from the aforementioned GitHub issues for LLaVA 1.5, even after scouring the web with Google and DuckDuckGo.
I am currently attempting to use the mm_projector.bin downloaded from the link above, from the liuhaotian LLaVA 1.5 archive.
Update: this has resulted in a series of size mismatch errors (not surprisingly, really), e.g. size mismatch for 0.weight: copying a param with shape torch.Size([5120, 1024]) from checkpoint, the shape in current model is torch.Size([3584, 1152]).
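For anyone trying to reproduce this, the mismatch can be checked before launching training. A torch Linear weight has shape [out_features, in_features], so the checkpoint above maps 1024 to 5120 dimensions while the current model expects 1152 to 3584. The sketch below compares parameter shapes using only the one pair from the error message. The helper names are hypothetical, and the key handling in load_checkpoint_shapes assumes the checkpoint keys contain an mm_projector. prefix, which I have not confirmed.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(checkpoint_shapes: Dict[str, Shape],
                          model_shapes: Dict[str, Shape]) -> List[str]:
    """Report parameters that are missing from the checkpoint or differ in shape."""
    problems = []
    for name, expected in sorted(model_shapes.items()):
        found = checkpoint_shapes.get(name)
        if found is None:
            problems.append(f"{name}: missing from checkpoint")
        elif tuple(found) != tuple(expected):
            problems.append(
                f"{name}: checkpoint {tuple(found)} vs model {tuple(expected)}"
            )
    return problems


def load_checkpoint_shapes(path: str) -> Dict[str, Shape]:
    """Read parameter shapes from a projector file (requires torch).

    Assumption: keys look like '<prefix>mm_projector.0.weight'; the prefix is
    stripped so the names line up with the ones in the error message.
    """
    import torch  # imported lazily so the comparison above works without torch

    state = torch.load(path, map_location="cpu")
    return {k.split("mm_projector.")[-1]: tuple(v.shape) for k, v in state.items()}


# Shapes copied from the error message above.
checkpoint_shapes = {"0.weight": (5120, 1024)}
model_shapes = {"0.weight": (3584, 1152)}

report = find_shape_mismatches(checkpoint_shapes, model_shapes)
print(report)
```

In practice checkpoint_shapes would come from load_checkpoint_shapes("mm_projector.bin") and model_shapes from the projector module of the model being fine-tuned.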
Please advise.