BERT-pytorch
PositionalEmbedding
The position embedding in BERT is not the same as in the original Transformer. Why not use the form used in BERT?
@Yang92to Great point, I'll check out the BERT positional embedding method and update ASAP.
@codertimo The BERT positional embedding method is to simply learn an embedding for each position. So you can use nn.Embedding with a constant input sequence [0, 1, 2, ..., L-1], where L is the maximum sequence length.
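A minimal sketch of what that could look like in PyTorch. The class name `LearnedPositionalEmbedding` and the parameter names `d_model` / `max_len` are illustrative, not taken from this repo:

```python
import torch
import torch.nn as nn


class LearnedPositionalEmbedding(nn.Module):
    """Learned positional embedding in the BERT style (sketch, not the repo's implementation).

    Each position 0..max_len-1 gets its own trainable vector, instead of the
    fixed sinusoidal encoding used in the original Transformer.
    """

    def __init__(self, d_model, max_len=512):
        super().__init__()
        self.embedding = nn.Embedding(max_len, d_model)

    def forward(self, x):
        # x: (batch_size, seq_len) token ids -- only the sequence length is used
        seq_len = x.size(1)
        positions = torch.arange(seq_len, device=x.device)   # [0, 1, ..., seq_len-1]
        # (1, seq_len, d_model), broadcasts over the batch when added to token embeddings
        return self.embedding(positions).unsqueeze(0)


# quick check
pos_emb = LearnedPositionalEmbedding(d_model=768, max_len=512)
tokens = torch.zeros(2, 128, dtype=torch.long)  # dummy batch of token ids
print(pos_emb(tokens).shape)  # torch.Size([1, 128, 768])
```

Unlike the sinusoidal version, these vectors are trained end to end, which is how the original BERT implementation handles positions.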
@codertimo Since BERT uses learned positional embeddings, and this is one of the biggest differences between the original Transformer and BERT, I think it is quite urgent to modify the positional embedding part.