transformer-tensorflow
positional encoding seems different from the paper
In the paper, it says:
PE(pos, 2i)   = sin(pos / 10000 ** (2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000 ** (2i / d_model))
So for dimension index i, the denominator should be 10000 ** (2 * (i // 2) / d_model); for example, with d_model = 4 the exponents for i = 0..3 are [0, 0, 1/2, 1/2], so dimensions (0, 1) share one frequency and (2, 3) share another.
I rewrote the function as:
```python
import numpy
import tensorflow as tf

def get_positional_encoding(dim, sentence_length, dtype=tf.float32):
    # 10000 ** (-2 * (i // 2) / dim), so dimensions 2k and 2k + 1 share a frequency
    div_term = numpy.power(10000.0, -(numpy.arange(dim) // 2).astype(numpy.float32) * 2.0 / dim)
    div_term = div_term.reshape(1, -1)
    # Outer product of positions and inverse frequencies: shape (sentence_length, dim)
    pos = numpy.arange(sentence_length, dtype=numpy.float32).reshape(-1, 1)
    encoded_vec = numpy.matmul(pos, div_term)
    # sin on even dimensions, cos on odd dimensions
    encoded_vec[:, 0::2] = numpy.sin(encoded_vec[:, 0::2])
    encoded_vec[:, 1::2] = numpy.cos(encoded_vec[:, 1::2])
    return tf.convert_to_tensor(encoded_vec.reshape([sentence_length, dim]), dtype=dtype)
```
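As a quick sanity check (my own snippet, not part of the repo), the output can be compared term by term against the paper's formula. This assumes `get_positional_encoding` from above is in scope and TF 2.x eager execution, so that `.numpy()` works on the returned tensor:

```python
import numpy

# Illustrative check only: verify each sin/cos pair uses the paper's frequency.
dim, sentence_length = 8, 4
pe = get_positional_encoding(dim, sentence_length).numpy()
for pos in range(sentence_length):
    for i in range(0, dim, 2):
        angle = pos / numpy.power(10000.0, float(i) / dim)  # 2 * (i // 2) == i for even i
        assert numpy.isclose(pe[pos, i], numpy.sin(angle), atol=1e-6)
        assert numpy.isclose(pe[pos, i + 1], numpy.cos(angle), atol=1e-6)
```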