minbert-assignment
A little confusion about the positional encoding.
Sir, I am a little confused about the positional encoding part in the bert.py file. Could you please explain it?
As described in the Transformer paper, the positional encoding is computed from sine and cosine functions of the positions. But here, an nn.Embedding layer is used instead. Why is an embedding layer used for positional encoding?
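For reference, here is a minimal sketch contrasting the two approaches I am asking about; the sizes (512, 768, 128) and names like `sinusoidal_encoding` are just illustrative and not taken from bert.py:

```python
import math
import torch
import torch.nn as nn

# Sinusoidal positional encoding, as in "Attention Is All You Need":
# fixed (non-learned) sin/cos values indexed by position.
def sinusoidal_encoding(max_len, hidden_size):
    pe = torch.zeros(max_len, hidden_size)
    position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, hidden_size, 2).float()
                         * (-math.log(10000.0) / hidden_size))
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe  # shape: (max_len, hidden_size)

# Learned positional embedding, as used by BERT:
# a trainable lookup table indexed by position id.
pos_embedding = nn.Embedding(num_embeddings=512, embedding_dim=768)
position_ids = torch.arange(128).unsqueeze(0)   # (1, seq_len)
learned_pe = pos_embedding(position_ids)        # (1, seq_len, 768)
```

My confusion is why the second (learned) variant is chosen here rather than the fixed sinusoidal one from the original paper.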