How to select the scale for the positional encoding: when should scale be used, and when not?
I read the positional encoding code and found this:
```python
def call(self, x):
    """ call function """
    seq_len = tf.shape(x)[1]
    if self.scale:
        # scale the input embeddings by sqrt(d_model) before adding the positional encoding
        x *= tf.math.sqrt(tf.cast(self.d_model, tf.float32))
    x += self.pos_encoding[:, :seq_len, :]
    return x
```
My question is: when should `scale` be used, and when not? Is there any experimental result or theory to guide the selection?
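For context, here is a minimal standalone sketch of what the flag changes numerically (this is not Athena's code; the embedding values with standard deviation 1/sqrt(d_model) are my assumption about a typical embedding initialization). The sinusoidal encoding has entries of order 1, so if the embeddings are small, the encoding can dominate the sum unless the embeddings are scaled up by sqrt(d_model):

```python
import numpy as np
import tensorflow as tf

d_model, seq_len = 256, 10

# standard sinusoidal positional encoding ("Attention Is All You Need")
pos = np.arange(seq_len)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
pe = np.zeros((seq_len, d_model), dtype=np.float32)
pe[:, 0::2] = np.sin(angle[:, 0::2])
pe[:, 1::2] = np.cos(angle[:, 1::2])
pe = tf.constant(pe[None, ...])  # shape (1, seq_len, d_model)

# assumed embeddings with std 1/sqrt(d_model), as many embedding initializers use
x = tf.random.normal((1, seq_len, d_model), stddev=d_model ** -0.5)

scale = tf.math.sqrt(tf.cast(d_model, tf.float32))
print("||x||           :", float(tf.norm(x)))          # small relative to pe
print("||x * sqrt(d)|| :", float(tf.norm(x * scale)))  # comparable to pe
print("||pe||          :", float(tf.norm(pe)))
```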