
How to choose the scale option for position encoding: when to use scale, and when not to?

Open l2009312042 opened this issue 4 years ago • 0 comments

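I read the position encoding code and found this:

    def call(self, x):
        """ call function """
        seq_len = tf.shape(x)[1]
        if self.scale:
            # Multiply the input by sqrt(d_model) before adding positions
            x *= tf.math.sqrt(tf.cast(self.d_model, tf.float32))
        # Add the encoding for the first seq_len positions
        x += self.pos_encoding[:, :seq_len, :]
        return x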

My question is: when should scale be used, and when not? Is there any experimental result or theory to guide the selection?
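To make concrete what the flag changes numerically, here is a minimal pure-Python sketch (not the Athena code). It assumes a standard sinusoidal table and, as a labeled assumption, inputs initialised with standard deviation `d_model ** -0.5`. The helper names `sinusoidal_encoding`, `rms` and `add_position` are hypothetical. One commonly cited rationale for the scale is that it brings such inputs to roughly the same magnitude as the encoding. This sketch only illustrates the magnitudes and is not an experimental result.

```python
import math
import random


def sinusoidal_encoding(seq_len, d_model):
    """Sinusoidal table: sin on even channels, cos on odd channels."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def rms(rows):
    """Root-mean-square over all elements of a 2-D list."""
    values = [v for row in rows for v in row]
    return math.sqrt(sum(v * v for v in values) / len(values))


def add_position(x, pos_encoding, d_model, scale):
    """Same steps as call(): optionally scale by sqrt(d_model), then add."""
    factor = math.sqrt(d_model) if scale else 1.0
    return [
        [xv * factor + pv for xv, pv in zip(x_row, pe_row)]
        for x_row, pe_row in zip(x, pos_encoding)
    ]


random.seed(0)
d_model, seq_len = 256, 50

# ASSUMPTION: inputs drawn with std d_model ** -0.5 (a common embedding init).
# Real inputs, e.g. the output of a convolutional front end, may differ.
x = [
    [random.gauss(0.0, d_model ** -0.5) for _ in range(d_model)]
    for _ in range(seq_len)
]
pe = sinusoidal_encoding(seq_len, d_model)

rms_x = rms(x)
rms_x_scaled = rms([[v * math.sqrt(d_model) for v in row] for row in x])
rms_pe = rms(pe)

print(f"input RMS without scale: {rms_x:.4f}")
print(f"input RMS with scale:    {rms_x_scaled:.4f}")
print(f"position encoding RMS:   {rms_pe:.4f}")
```

Under this assumption the unscaled input is roughly an order of magnitude smaller than the encoding, so the position term dominates the sum. With the scale the two are comparable. If the inputs already have roughly unit magnitude, the scale would instead make the position term comparatively small.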

l2009312042, Jan 21 '21 03:01