
Qformer mm_projector issue

Open zjutkarma opened this issue 5 months ago • 2 comments

Hello, thanks for your amazing work! I ran into a problem when running this code. Could you help me solve it?

When I run the training script DS_MLLMSD11_train.py, I get this error:

  File "/SmartEdit/model/DS_MLLMSD11_model.py", line 243, in load_pretrain_MLLM_alignment
    mm_projector_param = {'weight': weights.pop('mm_projector.weight'), 'bias': weights.pop('mm_projector.bias')}
KeyError: 'mm_projector.weight'

The path used for SD_QFormer_conversation_33tokens is: /SmartEdit/checkpoints/stage1_CC12M_alignment_7b/embeddings_qformer/checkpoint-150000.bin

In addition, the stage 1 inference code runs successfully.

The QFormer model trained in the first stage is a 6-block BERT-based model. I printed the keys of the model weight dict, and it does not seem to contain any "mm_projector" entry.

There is another point I'm confused about. As far as I understand, the "mm_projector" module exists only in the LLaVA model, where it maps the image embedding (produced by the ViT) from the image latent space into the text latent space. I don't understand why the QFormer module would need an mm_projector; the two seem like completely different things.
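The key check described above can be sketched roughly as follows. This is only an illustration: the `weights` dict is a hypothetical stand-in with made-up key names, not the real contents of checkpoint-150000.bin, and the helper functions are not part of SmartEdit. With PyTorch, the dict would normally come from `torch.load(path, map_location="cpu")`.

```python
from collections import Counter


def summarize_key_prefixes(state_dict, depth=1):
    """Count entries by the first `depth` dot-separated components of each key."""
    counts = Counter()
    for name in state_dict:
        prefix = ".".join(name.split(".")[:depth])
        counts[prefix] += 1
    return dict(counts)


def missing_keys(state_dict, required):
    """Return the required key names that are absent from the checkpoint."""
    return [key for key in required if key not in state_dict]


# Hypothetical stand-in for the loaded checkpoint (key names are assumptions).
weights = {
    "query_tokens": 0,
    "bert.encoder.layer.0.attention.self.query.weight": 0,
    "bert.encoder.layer.0.attention.self.query.bias": 0,
}

# Group keys by top-level module name to see what the file actually holds.
print(summarize_key_prefixes(weights))

# Check for the keys that load_pretrain_MLLM_alignment tries to pop.
print(missing_keys(weights, ["mm_projector.weight", "mm_projector.bias"]))
```

If the second call returns a non-empty list, `weights.pop('mm_projector.weight')` will raise the KeyError shown in the traceback, which suggests the checkpoint being loaded is not the file that function expects.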

zjutkarma, Sep 24 '24 11:09