Math Typo in Paper?? (dimension of linear layers in Q, K, V)
I am reading the latent diffusion paper, which states that for cross-attention Q, K, and V are computed via linear projections W_q, W_k, W_v.
But it then says W_v \in R^{d \times d_\epsilon^i} while W_q \in R^{d \times d_\tau} and W_k \in R^{d \times d_\tau}. Shouldn't it be the other way around, i.e. W_q \in R^{d \times d_\epsilon^i} and W_v \in R^{d \times d_\tau}? Q is computed from the flattened U-Net features \varphi_i(z_t) (dimension d_\epsilon^i), while K and V are both computed from the conditioning \tau_\theta(y) (dimension d_\tau), so W_k and W_v should share the same input dimension. Looking at the code here on GitHub, W_k and W_v do indeed have the same input dimension, but not in the paper. Is this a known typo?
Paper: https://arxiv.org/abs/2112.10752
Code:
https://github.com/CompVis/stable-diffusion/blob/21f890f9da3cfbeaba8e2ac3c425ee9e998d5229/ldm/modules/attention.py#L152
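For reference, here is a minimal single-head sketch of what I mean (not the repo's exact code; the dimensions and variable names are illustrative). The query projection takes the U-Net feature dimension d_\epsilon^i as input, while the key and value projections both take the conditioning dimension d_\tau:

```python
import torch
import torch.nn as nn

d = 320            # inner attention dim (d in the paper), illustrative
query_dim = 320    # d_eps^i: channels of the flattened U-Net features phi_i(z_t)
context_dim = 768  # d_tau: width of the conditioning embedding tau_theta(y)

to_q = nn.Linear(query_dim, d, bias=False)    # W_q: R^{d x d_eps^i}
to_k = nn.Linear(context_dim, d, bias=False)  # W_k: R^{d x d_tau}
to_v = nn.Linear(context_dim, d, bias=False)  # W_v: R^{d x d_tau} -- same input dim as W_k

x = torch.randn(1, 4096, query_dim)    # flattened U-Net feature map
ctx = torch.randn(1, 77, context_dim)  # e.g. a text embedding sequence

q, k, v = to_q(x), to_k(ctx), to_v(ctx)
attn = torch.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)  # (1, 4096, 77)
out = attn @ v                                                   # (1, 4096, d)
```

If I read the linked CrossAttention module correctly, this matches its to_q/to_k/to_v layers, where to_k and to_v both take context_dim as input, which is why I suspect the paper swapped the dimensions of W_q and W_v.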