Demi LuLu comments


TypeError: not a string

I had forgotten to download tokenizer.model from Hugging Face; it has to be fetched by running git clone with Git LFS enabled.
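For anyone hitting the same thing, here is a rough way to check whether a cloned model file is real data or only a Git LFS placeholder. It is a minimal sketch, assuming the standard Git LFS pointer format (a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`). The helper names `is_lfs_pointer` and `check_model_file` are made up for illustration and are not part of Video-LLaVA or transformers.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file (assumption: standard v1 pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def check_model_file(path):
    """Classify a model file as 'missing', 'lfs-pointer', or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if is_lfs_pointer(p):
        return "lfs-pointer"
    return "ok"
```

Calling `check_model_file` on the local tokenizer.model path shows which case applies. If it reports `lfs-pointer`, running `git lfs pull` inside the clone should fetch the actual file.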

Hi, is there a bug in Video-LLaVA-main/videollava/model/multimodal_encoder/builder.py?

I have the same problem on my local computer, but it works on https://colab.research.google.com/. The error looks like this: RuntimeError: Error(s) in loading state_dict for CLIPVisionModel: size mismatch for vision_model.embeddings.class_embedding: copying a param with...

Hi, is there a bug in Video-LLaVA-main/videollava/model/multimodal_encoder/builder.py?

Config file (excerpt):

"intermediate_size": 11008,
"max_position_embeddings": 4096,
"mm_hidden_size": 1024,
"mm_image_tower": "/home/demi/model_lib/LanguageBind_Image",
"mm_projector_type": "mlp2x_gelu",
"mm_use_x_patch_token": false,
"mm_use_x_start_end": false,
"mm_video_tower": "/home/demi/model_lib/LanguageBind_Video_merge",
"mm_vision_select_feature": "patch",
"mm_vision_select_layer": -2,
"model_type": "llava",
"num_attention_heads": 32,

![image](https://github.com/PKU-YuanGroup/Video-LLaVA/assets/142146484/79332d9a-0045-4577-964d-ae0340f59752)
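Since the config points both towers at local directories, one thing worth ruling out is a tower path that does not exist on this machine. The following is only a sketch, assuming the full config is a valid JSON file and that `mm_image_tower` and `mm_video_tower` are meant to be local directories, as they are in the excerpt. The function name `missing_tower_paths` is hypothetical, and it does not verify that the weights inside those directories match the expected model.

```python
import json
from pathlib import Path

# Config keys from the excerpt that hold local tower directories.
TOWER_KEYS = ("mm_image_tower", "mm_video_tower")


def missing_tower_paths(config_path, keys=TOWER_KEYS):
    """Return {key: path} for tower entries that are set but are not existing directories."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    problems = {}
    for key in keys:
        value = config.get(key)
        # Only string values are checked; absent or null keys are skipped.
        if isinstance(value, str) and not Path(value).is_dir():
            problems[key] = value
    return problems
```

An empty result only means the directories exist. A size mismatch when loading the state_dict can still occur if the files in them belong to a different checkpoint.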