annotated-transformer
Typo in Multihead-attention:
In the MultiHeadedAttention class, the line
self.linears = clones(nn.Linear(d_model, d_model), 4)
occurs, but it should be a 3 instead of a 4:
self.linears = clones(nn.Linear(d_model, d_model), 3)
Am I correct?
Shared credit to @samiede.
@Jostarndt, perhaps not, since you need 4 self.linears:
- linear_Q (to project query)
- linear_K (to project key)
- linear_V (to project value)
- linear_O (to project output)
It's true that the fourth linear is a bit more hidden, since it sits at the very top, after the concat, as seen in the image below 😅
I was looking into the issues section with the same doubt in mind and @stanwinata is right. The code is correct, just a bit obscure (I spent half an hour trying to figure it out).
When you use zip, the iterator it returns stops at the shortest iterable. Therefore, if you zip two iterables, you get only as many items as the shorter one has. That is why zipping the four self.linears with the three inputs (query, key, value) only uses the first three layers. For instance:
    >>> long_iterable = [1, 2, 3, 4]
    >>> short_iterable = ['q', 'k', 'v']
    >>> for a, b in zip(long_iterable, short_iterable):
    ...     print(a, b)
    ...
    1 q
    2 k
    3 v
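To connect this back to the attention code, here is a rough sketch of the routing with plain Python stand-ins instead of real nn.Linear layers. The helper make_layer and the string "layers" are hypothetical and only there to make the data flow visible. As far as I can tell from the forward pass, the first three layers are consumed by the zip and the last one is applied explicitly via self.linears[-1]:

```python
# Hypothetical stand-ins for the four nn.Linear layers: each "layer"
# just wraps its input in its own name so the routing can be read off.
def make_layer(name):
    return lambda x: f"{name}({x})"

linears = [make_layer(n) for n in ("linear_Q", "linear_K", "linear_V", "linear_O")]
inputs = ("query", "key", "value")

# zip stops after the third pair, so linear_O is not consumed here.
projected = [lin(x) for lin, x in zip(linears, inputs)]

# Placeholder for the attention computation and the concat of the heads.
attended = "concat(attention(" + ", ".join(projected) + "))"

# The fourth layer is picked up explicitly by index, after the concat.
output = linears[-1](attended)

print(projected)
print(output)
```

So with only 3 clones, the last element would be linear_V, and the output projection would silently share weights with the value projection instead of having its own.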
Yes, I forgot to reply to the very first answer! You are so right, thanks for pointing out why it has to be 4!