pytorch-stable-diffusion
Why do we need to set causal_mask=True in CLIP?
The prompt is a complete sentence, and we don't need to predict the next token in it, so is there a problem with letting each token attend to the tokens on its right? Why does the code use `x = self.attention(x, causal_mask=True)`?
This is my question as well.
I had the same query. This is the answer I found in the CLIP paper by OpenAI:
"Masked self-attention was used in the text encoder to preserve the ability to initialize with a pre-trained language model or add language modeling as an auxiliary objective, though exploration of this is left as future work."
Paper: https://arxiv.org/pdf/2103.00020
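For anyone else wondering what the flag actually does mechanically: below is a minimal sketch (not the repo's exact code; the function name `attention_scores` is just for illustration) of how a causal mask affects self-attention. Positions above the diagonal of the score matrix correspond to tokens on the right, and setting them to -inf before the softmax zeroes out their attention weights.

```python
import torch

def attention_scores(q, k, causal_mask=True):
    # Scaled dot-product attention scores, shape (seq_len, seq_len).
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k**0.5
    if causal_mask:
        # Strict upper triangle = future (right-hand) positions.
        mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    # Softmax turns the -inf entries into exactly 0 attention weight.
    return torch.softmax(scores, dim=-1)

seq_len, d_model = 4, 8
x = torch.randn(seq_len, d_model)
print(attention_scores(x, x))  # row i has zeros in every column j > i
```

So with `causal_mask=True`, token i's embedding is built only from tokens 0..i, matching the GPT-style language-modeling setup the paper quote refers to, rather than a bidirectional (BERT-style) encoding of the prompt.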