LLaVA-NeXT
inference with LLM and vision frozen
Dear all, I have fine-tuned LLaVA-OneVision models (0.5B and 7B) with the LLM and vision components frozen, so only the multimodal projector was trained. The checkpoint output directory contains these files:
```
runs/
checkpoint-1000/
...
checkpoint-22000/
trainer_state.json
mm_projector.bin
config.json
```
Loading the trained model with `load_pretrained_model(model_path, model_base=None, model_name='llava_qwen', device_map='auto')` fails because no tokenizer files are found in `model_path`. When I copy the tokenizer files (`tokenizer_config.json`, `tokenizer.json`) over from the base model, loading fails with this error:

```
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory.
```
I had a look at the `load_pretrained_model` method in `builder.py`. It seems that I should set `model_base` to the base model (e.g. `lmms-lab/llava-onevision-qwen2-0.5b-ov`) rather than leaving it as `None`. It also seems that the logic for loading a Qwen model is missing from the method.
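Concretely, the call I am aiming for looks like this (a minimal sketch; the checkpoint path is illustrative):

```python
from llava.model.builder import load_pretrained_model

# model_path points at the fine-tuned checkpoint directory (the one that
# contains mm_projector.bin and config.json); model_base is the original
# LLaVA-OneVision model the projector was trained on top of.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path="./output/checkpoint-22000",  # illustrative path
    model_base="lmms-lab/llava-onevision-qwen2-0.5b-ov",
    model_name="llava_qwen",
    device_map="auto",
)
```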
I tried to add this code:
elif "qwen" in model_name.lower():
from llava.model.language_model.llava_qwen import LlavaQwenConfig, LlavaQwenForCausalLM
tokenizer = AutoTokenizer.from_pretrained(model_base, use_fast=False)
if overwrite_config is not None:
llava_cfg = LlavaQwenConfig.from_pretrained(model_path)
rank0_print(f"Overwriting config with {overwrite_config}")
for k, v in overwrite_config.items():
setattr(llava_cfg, k, v)
model = LlavaQwenForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, attn_implementation=attn_implementation, config=llava_cfg, **kwargs)
else:
model = LlavaQwenForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, attn_implementation=attn_implementation, **kwargs)
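Note that the tokenizer and the LLM weights are taken from `model_base`, while the config is read from `model_path` (the fine-tuned checkpoint) when `overwrite_config` is given, mirroring the neighbouring branches in the method.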
I inserted it right after this existing block in `builder.py`:
```python
elif model_base is not None:
    ...
    elif (
        "wizardlm-2" in model_name.lower()
        and "vicuna" in model_name.lower()
        or "llama" in model_name.lower()
        or "yi" in model_name.lower()
        or "nous-hermes" in model_name.lower()
        or "llava-v1.6-34b" in model_name.lower()
        or "llava-v1.5" in model_name.lower()
    ):
        ...
        model = LlavaLlamaForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, config=llava_cfg, **kwargs)
    # [ADD CODE HERE]
```
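With the Qwen branch in place, execution falls through to the shared code further down in the same `model_base is not None` path, which, if I am reading `builder.py` correctly, restores the fine-tuned projector from the checkpoint with something like:

```python
# Paraphrased sketch of the shared projector-loading step: the trained
# weights in mm_projector.bin are loaded on top of the base model.
mm_projector_weights = torch.load(os.path.join(model_path, "mm_projector.bin"), map_location="cpu")
mm_projector_weights = {k: v.to(torch.float16) for k, v in mm_projector_weights.items()}
model.load_state_dict(mm_projector_weights, strict=False)
```

This would explain why pointing `model_path` at the checkpoint directory is enough to recover the trained projector.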
I managed to load the model with this fix. Can you please confirm whether this is correct, or whether I am doing something wrong? Thanks a lot for your help.