CLIP_prefix_caption
Error when concatenating prefix_projection and text embedding
I successfully ran the parse_coco.py file, and the CLIP output was stored in my data files. Then I tried to run the GPT-2 fine-tuning program, and an error was reported.
The error seems to occur at line 234, where the prefix projection is concatenated with the text embedding.
I printed the shapes of the relevant tensors:
prefix: (40, 1, 512, 512)
self.clip_project(prefix): (40, 1, 512, 7680)
prefix_projection: (20480, 10, 768)
embedding_text: (40, 41, 768)
I think the problem is that prefix_projection should have shape (batch_size, prefix_length, embedding_size).
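For reference, this is the shape flow I expected, based on the numbers above. It is only a minimal sketch: a plain nn.Linear stands in for self.clip_project, and prefix_length = 10, CLIP dimension 512, and GPT-2 embedding size 768 are assumptions read off the printed shapes.

```python
import torch
import torch.nn as nn

# Assumed dimensions, taken from the printed shapes above
batch_size, clip_dim, prefix_length, gpt_dim = 40, 512, 10, 768

# Stand-in for self.clip_project (the real model may use an MLP instead)
clip_project = nn.Linear(clip_dim, prefix_length * gpt_dim)  # 512 -> 7680

prefix = torch.randn(batch_size, clip_dim)  # (40, 512): one CLIP vector per image
prefix_projection = clip_project(prefix).view(batch_size, prefix_length, gpt_dim)
print(prefix_projection.shape)              # torch.Size([40, 10, 768])

embedding_text = torch.randn(batch_size, 41, gpt_dim)        # (40, 41, 768)
concat = torch.cat((prefix_projection, embedding_text), dim=1)
print(concat.shape)                                          # torch.Size([40, 51, 768])
```

With my actual prefix of shape (40, 1, 512, 512), the projection instead produces (40, 1, 512, 7680), which reshapes to (20480, 10, 768) and no longer matches embedding_text along the batch dimension.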
So what shape is prefix supposed to have?
Thanks