CLIP
How is the text encoder initialized?
The paper mentions that the text encoder is a Transformer with the architecture modifications from GPT-2. My question is: is the text encoder trained from scratch, or is it initialized with the learned parameters of GPT-2?
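One indirect way to probe this yourself (a sketch, assuming the Hugging Face `transformers` implementations of both models): if CLIP's text encoder reused GPT-2's weights, the two models would at minimum need matching vocabularies and hidden sizes. Comparing the default configs shows they differ:

```python
# Compare default configs of CLIP's text encoder and GPT-2
# (defaults correspond to openai/clip-vit-base-patch32 and the 124M GPT-2).
from transformers import CLIPTextConfig, GPT2Config

clip_text = CLIPTextConfig()
gpt2 = GPT2Config()

# CLIP's text encoder uses a different BPE vocabulary size than GPT-2,
# and a smaller hidden dimension, so a direct weight copy is not possible.
print(clip_text.vocab_size, gpt2.vocab_size)  # 49408 vs 50257
print(clip_text.hidden_size, gpt2.n_embd)     # 512 vs 768
```

This doesn't settle the training question by itself, but the mismatched shapes mean the GPT-2 checkpoint could not have been loaded directly into the text encoder.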
Thanks in advance for your answer.