MultiHeadedAttention: affine transforms

Open axelbr opened this issue 3 years ago • 0 comments

First of all, thank you for this work. The notebook is really easy to follow.

My question is the following: in the MultiHeadedAttention class, you instantiate 4 affine layers rather than 4 purely linear ones, since bias is True by default. Is this on purpose? If so, the text should be updated, because it mentions only the 4 weight matrices and no bias terms.
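
To illustrate the distinction, here is a minimal sketch assuming PyTorch's `nn.Linear`. The value `d_model = 512` and the variable names are my own choices for illustration and are not taken from the notebook's code:

```python
import torch
import torch.nn as nn

d_model = 512  # illustrative size (assumption, not from the issue)

# Default construction: bias=True, so the layer is affine, y = x W^T + b.
affine = nn.Linear(d_model, d_model)
# With bias=False the layer is a pure linear map, y = x W^T.
linear = nn.Linear(d_model, d_model, bias=False)

has_bias = affine.bias is not None
no_bias = linear.bias is None

zero = torch.zeros(1, d_model)
with torch.no_grad():
    # A linear map sends the zero vector to zero...
    linear_maps_zero_to_zero = torch.equal(linear(zero), zero)
    # ...while an affine map returns its bias vector.
    affine_returns_bias = torch.allclose(affine(zero), affine.bias.unsqueeze(0))

# The affine layer carries d_model extra parameters (the bias).
extra_params = sum(p.numel() for p in affine.parameters()) - sum(
    p.numel() for p in linear.parameters()
)
```

If only the 4 matrices are intended, passing `bias=False` when constructing the layers would match the text. Otherwise the text could mention the bias terms.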

axelbr, Jun 08 '22 11:06