Question about get_prompt_embedding_LLM in DS_SmartEdit_test.py
I noticed that in your code num_new_tokens is set to 32, and that in the documentation you mention retaining some tokens while adding new ones. However, for an input image with a resolution of 512, llm_img_token_states.shape[1] comes out to 35. Does this method impose any requirements on the input image resolution? Here is the log output from my run:
```
instruction:What would happen if one of the tourists accidentally slips on wet rocks near the waterfall?
token_id:None,LLM_tokenizer.img_start_token_id:32003
token_id:32018,LLM_tokenizer.img_start_token_id:32003
Saving LLM embeddings... 32018
token_id:32011,LLM_tokenizer.img_start_token_id:32003
Saving LLM embeddings... 32011
token_id:32011,LLM_tokenizer.img_start_token_id:32003
Saving LLM embeddings... 32011
token_id:32006,LLM_tokenizer.img_start_token_id:32003
Saving LLM embeddings... 32006
[... the same pair of lines repeats for the remaining tokens; 35 embeddings are saved in total, all with ids 32018, 32011, or 32006 ...]
llm_img_token_states.shape:torch.Size([1, 35, 5120])
```
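For context on where that shape comes from: the second dimension counts the positions in the prompt whose token ids belong to the newly added image tokens, not num_new_tokens itself. Below is a minimal toy sketch of that gathering step; it is an illustration only, not the actual SmartEdit code, and the token-id range, the fake prompt, and the variable names are all assumptions.

```python
import torch

# Toy illustration: collect hidden states at positions whose token id is one
# of the newly added image tokens. If the prompt contains N such tokens, the
# gathered tensor has shape [1, N, hidden_dim], independent of how
# num_new_tokens was configured when the tokens were registered.
hidden_dim = 5120
input_ids = torch.tensor([[1, 9038, 32018, 32011, 32011, 32006, 2]])  # fake prompt
hidden_states = torch.randn(1, input_ids.shape[1], hidden_dim)        # fake LLM output

new_token_ids = set(range(32004, 32036))  # assumed id range of the added tokens
mask = torch.tensor([[tid in new_token_ids for tid in input_ids[0].tolist()]])
llm_img_token_states = hidden_states[mask].unsqueeze(0)
print(llm_img_token_states.shape)  # torch.Size([1, 4, 5120]) for this toy prompt
```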
Have you prepared the correct checkpoint files?
I previously downloaded LLaVA-13b-delta-v1-1 as the pretrained LLaVA weights, which caused the same issue. As the LLaVA documentation notes, these delta weights must first be merged with the original LLaMA weights. Once properly combined, the model works as expected.
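For reference, the LLaVA repo documents this merge via its llava.model.apply_delta module. A minimal sketch of that step, run from Python for convenience; the paths are placeholders for your local setup, and the exact flags may vary between LLaVA versions:

```python
# Merge LLaVA delta weights into the base LLaMA weights.
# Equivalent to running, from an environment with LLaVA installed:
#   python3 -m llava.model.apply_delta --base <llama-13b> \
#       --target <output-dir> --delta liuhaotian/LLaVA-13b-delta-v1-1
import subprocess

subprocess.run(
    [
        "python3", "-m", "llava.model.apply_delta",
        "--base", "/path/to/llama-13b",                # original LLaMA-13B weights (placeholder path)
        "--target", "/path/to/LLaVA-13b-v1-1-merged",  # where the merged model is written (placeholder path)
        "--delta", "liuhaotian/LLaVA-13b-delta-v1-1",  # delta weights from Hugging Face
    ],
    check=True,
)
```

The merged output directory is then what you point the SmartEdit scripts at as the pretrained LLaVA checkpoint, instead of the raw delta weights.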