Some confusion about the positional encoder

Open GuoShi28 opened this issue 5 years ago • 1 comments

  1. PE(pos, 2i) = sin(pos / 10000^(2i/dim)). Why is the constant set to 10000? Does it have some meaning for this task?

  2. I do not fully understand the positional encoder. Why are the sin and cos functions used alternately?

Looking forward to your reply. Thank you.

GuoShi28 avatar Apr 19 '19 08:04 GuoShi28

As far as I know, positional encoding is a method used in the Transformer model. The expression sin(pos / 10000^(2i/dim)) is a sinusoid, that is, a function that can be produced by shifting, stretching, or compressing the sine function.
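To make the formula concrete, here is a minimal sketch in plain Python of the encoding for a single position. The function name `positional_encoding` and the `base` parameter are my own naming, not something taken from this repository, and I assume `dim` is even. Even indices hold the sine term and odd indices hold the cosine term with the same frequency, which is the alternation asked about in question 2.

```python
import math

def positional_encoding(pos, dim, base=10000.0):
    """Return the sinusoidal encoding of one position as a list of length dim.

    Index 2i holds sin(pos / base**(2i/dim)); index 2i+1 holds the
    cosine of the same angle. dim is assumed to be even.
    """
    pe = [0.0] * dim
    for i in range(dim // 2):
        # Frequency shrinks geometrically as i grows, so later
        # dimensions oscillate more slowly along the sequence.
        angle = pos / base ** (2 * i / dim)
        pe[2 * i] = math.sin(angle)
        pe[2 * i + 1] = math.cos(angle)
    return pe

pe0 = positional_encoding(0, 8)  # every sin term is 0, every cos term is 1
pe1 = positional_encoding(1, 8)
```

With base 10000, the paper notes that the wavelengths form a geometric progression from 2π to 10000 · 2π. As far as I can tell, the paper does not explain why 10000 specifically was chosen.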

According to the "Attention is all you need" paper, the authors chose the sinusoidal function because they hypothesized it would allow the model to easily learn to attend by relative positions, since for any fixed offset k, PE_(pos+k) can be represented as a linear function of PE_pos.

The quoted hypothesis is in Section 3.5 of the "Attention is all you need" paper: https://arxiv.org/pdf/1706.03762.pdf
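The linear-function claim can be checked numerically. The sketch below is my own illustration, not code from the paper or this repository, and the helper names `sin_cos_pair` and `shift_pair` are hypothetical. It uses the angle-addition identities to show that each (sin, cos) pair at position pos + k is a 2x2 rotation of the pair at position pos, where the rotation depends only on k and the dimension index i. This only works because each frequency has both a sine and a cosine component, which is probably why the two are used together.

```python
import math

def sin_cos_pair(pos, i, dim, base=10000.0):
    """Return (sin, cos) of the i-th frequency at position pos."""
    w = 1.0 / base ** (2 * i / dim)
    return math.sin(w * pos), math.cos(w * pos)

def shift_pair(pair, k, i, dim, base=10000.0):
    """Map the pair at pos to the pair at pos + k without knowing pos.

    This is a linear map (a rotation) whose coefficients depend only
    on the offset k and the frequency index i.
    """
    w = 1.0 / base ** (2 * i / dim)
    s, c = pair
    sk, ck = math.sin(w * k), math.cos(w * k)
    # sin(a + b) = sin(a)cos(b) + cos(a)sin(b)
    # cos(a + b) = cos(a)cos(b) - sin(a)sin(b)
    return s * ck + c * sk, c * ck - s * sk

dim, i, k = 16, 3, 5  # arbitrary illustrative values
max_error = 0.0
for pos in range(50):
    predicted = shift_pair(sin_cos_pair(pos, i, dim), k, i, dim)
    actual = sin_cos_pair(pos + k, i, dim)
    max_error = max(max_error,
                    abs(predicted[0] - actual[0]),
                    abs(predicted[1] - actual[1]))
```

After the loop, `max_error` should be at the level of floating-point rounding, meaning the same fixed map works for every starting position.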

YeonwooSung avatar Jun 27 '20 07:06 YeonwooSung