
The order of connection between word vectors and image vectors in prompt

Open Edisonhimself opened this issue 9 months ago • 0 comments

Thank you for your work. I have a few questions about the order in which the image vectors and text vectors are concatenated in the input prompt during the training and evaluation stages of MiniGPT-4.

1. In the evaluation stage (as seen in the demo), the input prompt places the word vectors first and the image feature vectors after them. Could you please explain the reason for this order? It seems to differ from the order used in the second training stage.
2. In the first training stage, the image embeddings are fed directly into the large language model without any prompt word vectors to assist. How does this approach still achieve a preliminary "interaction between images and text" effect?
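To make question 1 concrete, here is a toy sketch of the two orderings I am comparing. It is not MiniGPT-4's actual code: the `<ImageHere>` marker, `embed_text`, and `build_inputs` are hypothetical names used only for illustration, and the embeddings are fake. It only shows that wherever the image marker sits in the prompt template decides whether the image vectors end up before or after the word vectors.

```python
from typing import List, Tuple

Vector = List[float]

# Hypothetical marker for where the image vectors are spliced into the prompt.
PLACEHOLDER = "<ImageHere>"


def embed_text(text: str, dim: int = 4) -> List[Vector]:
    """Toy stand-in for an embedding lookup: one vector per whitespace token."""
    return [[float(len(token))] * dim for token in text.split()]


def build_inputs(
    template: str, image_embeds: List[Vector], dim: int = 4
) -> Tuple[List[Vector], List[str]]:
    """Splice image vectors into the text embeddings at the placeholder.

    Returns the concatenated sequence and a parallel list of "text"/"image"
    labels, so the resulting order is easy to inspect.
    """
    before, marker, after = template.partition(PLACEHOLDER)
    if not marker:
        raise ValueError("template has no image placeholder")

    sequence: List[Vector] = []
    kinds: List[str] = []
    segments = (
        ("text", embed_text(before, dim)),
        ("image", image_embeds),
        ("text", embed_text(after, dim)),
    )
    for kind, chunk in segments:
        sequence.extend(chunk)
        kinds.extend([kind] * len(chunk))
    return sequence, kinds


# Two fake image tokens standing in for the projected visual features.
image = [[0.5] * 4, [0.25] * 4]

# Word vectors first, image vectors last (what I see in the demo).
_, text_first = build_inputs("Describe this image: <ImageHere>", image)

# Image vectors first, word vectors last.
_, image_first = build_inputs("<ImageHere> Describe this image.", image)

print(text_first)
print(image_first)
```

If the real implementation works along these lines, the difference between evaluation and second-stage training would come down to the prompt template rather than to the model itself, which is what I would like to confirm.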

Edisonhimself • May 20 '24 06:05