annotated-transformer Typo in Multihead-attention:

Open Jostarndt opened this issue 2 years ago • 2 comments

In the MultiheadAttention the line

self.linears = clones(nn.Linear(d_model, d_model), 4)

occurs, but it should be a 3 instead of a 4:

self.linears = clones(nn.Linear(d_model, d_model), 3)

Am I correct?

Aug 25 '22 15:08 Jostarndt

shared credits to @samiede

Aug 25 '22 15:08 Jostarndt

@Jostarndt, perhaps not, since you need 4 self.linears:

linear_Q (to project query)
linear_K (to project key)
linear_V (to project value)
linear_O (to project output)

It's true that the fourth linear is a bit more hidden, since it sits at the very top, after the concat, as seen in the image below 😅

[Image: Screen Shot 2022-09-05 at 4 23 20 PM]
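To make the roles of the four layers concrete, here is a minimal, torch-free sketch. It assumes that clones simply deep-copies a module N times; the FakeLinear class and the role names used as dictionary keys are hypothetical stand-ins for illustration, not the repository's code.

```python
import copy


class FakeLinear:
    """Hypothetical stand-in for nn.Linear(d_model, d_model); only holds a weight list."""

    def __init__(self, d_model):
        self.weight = [0.0] * d_model


def clones(module, n):
    # Assumed behaviour: return n independent deep copies of the same module
    return [copy.deepcopy(module) for _ in range(n)]


linears = clones(FakeLinear(8), 4)

# Roles by position, following the list above
roles = dict(zip(("linear_Q", "linear_K", "linear_V", "linear_O"), linears))

# The copies do not share parameters: changing one leaves the others untouched
linears[0].weight[0] = 1.0
print(len(linears), linears[3].weight[0])
```

With only 3 clones there would be no layer left for the output projection, which is why the count of 4 matters.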

Sep 05 '22 23:09 stanwinata

I was looking into the issues section with the same doubt in mind and @stanwinata is right. The code is correct, just a bit obscure (I spent half an hour trying to figure it out).

When you use zip, the resulting iterator (here I mean Python's built-in zip) stops as soon as the shortest iterable is exhausted. Therefore, if you zip two iterables, you get only as many items as the shorter one contains. For instance:

>>> long_iterable = [1, 2, 3, 4]
>>> short_iterable = ['q', 'k', 'v']
>>> for a, b in zip(long_iterable, short_iterable):
...     print(a, b)
...
1 q
2 k
3 v
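Applied to the attention code, the same behaviour can be sketched roughly as follows. This is an illustration rather than the repository's code: the layers are hypothetical string-tagging stand-ins, and it assumes the forward pass zips self.linears with (query, key, value) and applies the last layer separately after the concat, as described earlier in this thread.

```python
def make_layer(name):
    # Hypothetical stand-in for a linear layer: tags its input with the layer name
    return lambda x: f"{name}({x})"


linears = [make_layer(n) for n in ("linear_Q", "linear_K", "linear_V", "linear_O")]

# zip yields only three pairs, so linear_O is left unused at this step
projected = [lin(x) for lin, x in zip(linears, ("query", "key", "value"))]
print(projected)

# The fourth layer is applied on its own, after the heads are concatenated
output = linears[-1]("concat")
print(output)
```

So the list needs four entries even though the zip only ever touches the first three.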

Sep 06 '23 13:09 ljmanso

Yes, I forgot to reply to the first answer! You are so right, thanks for pointing out the fourth linear!

Sep 06 '23 14:09 Jostarndt
