SmartEdit
SmartEdit copied to clipboard
Qformer mm_projector issue
Hello, thanks for ur amazing work! I have a problem when running this code, could u help me to solve it?
When I train the script DS_MLLMSD11_train.py, I encountered this error.
File "/SmartEdit/model/DS_MLLMSD11_model.py", line 243, in load_pretrain_MLLM_alignment
mm_projector_param = {'weight': weights.pop('mm_projector.weight'), 'bias': weights.pop('mm_projector.bias')}
KeyError: 'mm_projector.weight'
The directory of the SD_QFormer_conversation_33tokens is: /SmartEdit/checkpoints/stage1_CC12M_alignment_7b/embeddings_qformer/checkpoint-150000.bin
In addition, I run the stage1 inference code successfully.
The qformer model trained in the first stage is a 6 block bert-based model. I print the keys in the model weight dict, it seems that the model doesn't contain "mm_projector" item.
And there is another question that I'm confused about, I think the "mm_projector" module only contains in the llava module, its functionality is to convert the image embedding(using vit) in the image latent space into the text latent space. I have no idea why qformer module needs mm_projector module. I think these two are completely different things.