monney
Getting the same issue
Here's my full error printout ``` [+] Creating directory: ./rips/instagram_instagram Retrieving https://www.instagram.com/instagram/ name:1:0 Expected an operand but found < ^ name:2:0 Expected an operand but found < ^ name:3:4 Expected...
Interesting, I also thought we were summing over just the kernel, as in a convolution. This answers my question from the other thread, and explains why we need to aggregate...
Hi, thank you for your paper and congrats on SOTA. I have a question related to this: from the linear projection we generate an attention map for each of pixel...
@Andrew-Qibin I think the main question here is that since the attention matrix does not use similarity scores in its generation, the attention scores are only based on the central...
@toodle I think because of #7 the weights end up being based on the KxK pixels at least indirectly, aggregating from (2K-1)x(2K-1) surrounding pixels. It differs from traditional attention though...
@toodle I agree with you. The attention weights don’t take the similarity between the pixels into account, which to me is a key piece of attention. But the way they...
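For anyone following along, here is a rough NumPy sketch of the mechanism as I read it from this thread, not the authors' implementation. The function name `projected_window_attention`, the weight shapes, and the zero padding are all my own assumptions. The point it illustrates: the weights for each KxK window come from a linear projection of the centre pixel alone (no query-key dot product), and summing the overlapping windows back is what lets each output pixel draw on a (2K-1)x(2K-1) neighbourhood.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def projected_window_attention(x, w_attn, w_val, k=3):
    """Hypothetical sketch. x: (H, W, C); w_attn: (C, k**4); w_val: (C, C)."""
    h, w, c = x.shape
    pad = k // 2
    # Value projection, zero-padded so every pixel has a full k x k window.
    v = np.pad(x @ w_val, ((pad, pad), (pad, pad), (0, 0)))
    out = np.zeros_like(v)
    for i in range(h):
        for j in range(w):
            # Weights come from the centre pixel only: a linear projection
            # reshaped to (k*k, k*k), with no similarity score involved.
            a = softmax((x[i, j] @ w_attn).reshape(k * k, k * k), axis=-1)
            window = v[i:i + k, j:j + k].reshape(k * k, c)
            mixed = (a @ window).reshape(k, k, c)
            # Sum overlapping windows back into place; this aggregation is
            # what widens the effective neighbourhood to (2k-1) x (2k-1).
            out[i:i + k, j:j + k] += mixed
    return out[pad:pad + h, pad:pad + w]

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
w_attn = rng.normal(size=(4, 3 ** 4))
w_val = rng.normal(size=(4, 4))
out = projected_window_attention(x, w_attn, w_val, k=3)
```

With K=3, changing the input at one pixel affects outputs up to 2 pixels away in each direction and nothing beyond that, which matches the (2K-1)x(2K-1) reading above.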
Thank you, I think these numbers are probably worth adding to the paper as well for context.
Interesting result. Here: https://github.com/google-research/google-research/issues/534#issuecomment-763805052 The author says they use hard labels for large scale tasks, and for smaller ones soft labels converge faster. But I don't know if they used...
I had read that comment previously but wasn't sure if he meant hard labels or the repo as is. I think you're right, though. I looked through this repo's code...