transformer-tensorflow
positional encoding seems different from the paper
In the paper, it says:
PE(pos, 2i)   = sin(pos / 10000 ** (2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000 ** (2i / d_model))
So for dimension index i, the denominator should be 10000 ** (2 * (i // 2) / d_model); for example, with d_model = 4 the exponents for i = 0..3 are [0, 0, 1/2, 1/2], so dimensions (0, 1) share one frequency and (2, 3) share another.
I rewrote the function as:
```python
import numpy
import tensorflow as tf

def get_positional_encoding(dim, sentence_length, dtype=tf.float32):
    # 10000 ** (-2 * (i // 2) / dim), so dimensions 2k and 2k + 1 share a frequency
    div_term = numpy.power(10000.0, -(numpy.arange(dim) // 2).astype(numpy.float32) * 2.0 / dim)
    div_term = div_term.reshape(1, -1)
    # Outer product of positions and inverse frequencies: shape (sentence_length, dim)
    pos = numpy.arange(sentence_length, dtype=numpy.float32).reshape(-1, 1)
    encoded_vec = numpy.matmul(pos, div_term)
    # sin on even dimensions, cos on odd dimensions
    encoded_vec[:, 0::2] = numpy.sin(encoded_vec[:, 0::2])
    encoded_vec[:, 1::2] = numpy.cos(encoded_vec[:, 1::2])
    return tf.convert_to_tensor(encoded_vec.reshape([sentence_length, dim]), dtype=dtype)
```
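As a quick sanity check (my own snippet, not part of the repo), the output can be compared term by term against the paper's formula. This assumes `get_positional_encoding` from above is in scope and TF 2.x eager execution, so that `.numpy()` works on the returned tensor:

```python
import numpy

# Illustrative check only: verify each sin/cos pair uses the paper's frequency.
dim, sentence_length = 8, 4
pe = get_positional_encoding(dim, sentence_length).numpy()
for pos in range(sentence_length):
    for i in range(0, dim, 2):
        angle = pos / numpy.power(10000.0, float(i) / dim)  # 2 * (i // 2) == i for even i
        assert numpy.isclose(pe[pos, i], numpy.sin(angle), atol=1e-6)
        assert numpy.isclose(pe[pos, i + 1], numpy.cos(angle), atol=1e-6)
```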