
default embedding

Open nashid opened this issue 5 years ago • 5 comments

If we do not provide pretrained embeddings such as word2vec, how does the model represent the words?

Does it use one-hot encoding by default, or something like n-grams, CBOW, or skip-grams?

nashid avatar May 21 '20 01:05 nashid

No. If you do not provide pretrained embeddings, the framework creates a trainable variable and initializes it with some initialization scheme (e.g., random initialization). When you train the model on your data, this variable is updated too.
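For illustration, here is a minimal TensorFlow sketch of such a trainable, randomly initialized embedding table. The variable name, vocabulary size, and embedding size are made up for the example and are not taken from this repo's code:

```python
import tensorflow as tf

# Illustrative sizes, not this framework's defaults.
vocab_size = 10000
embedding_size = 256

# A trainable embedding table, initialized randomly (uniform here).
# Because it is trainable, gradients flow into it during training,
# so the embeddings are learned from your data.
embedding = tf.Variable(
    tf.random.uniform([vocab_size, embedding_size], -0.1, 0.1),
    name="embedding",
    trainable=True,
)
```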

luozhouyang avatar Jun 06 '20 03:06 luozhouyang

@luozhouyang I understand that if we do not provide pre-trained embeddings, the framework uses its own default embedding implementation.

However, I would like to know what algorithm is used to build the embedding.

nashid avatar Jun 07 '20 23:06 nashid

Word embeddings here are just a 2-D tensor with shape (vocab_size, embedding_size). This tensor is updated along with the other parameters by backpropagation.
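To make the backpropagation point concrete, a small sketch: rows of the table are looked up by token id, and gradients are computed with respect to the table itself. The token ids and the loss are stand-ins for illustration, not the framework's actual training code:

```python
import tensorflow as tf

vocab_size, embedding_size = 10000, 256  # illustrative values
embedding = tf.Variable(
    tf.random.uniform([vocab_size, embedding_size], -0.1, 0.1)
)

token_ids = tf.constant([[3, 17, 42]])  # a toy batch of token ids
with tf.GradientTape() as tape:
    # Look up one embedding vector per token: shape (1, 3, embedding_size).
    vectors = tf.nn.embedding_lookup(embedding, token_ids)
    # Stand-in for the real model loss.
    loss = tf.reduce_sum(vectors ** 2)

# Gradients w.r.t. the embedding table: non-zero only for rows 3, 17, 42,
# i.e. only the embeddings of tokens seen in the batch get updated.
grads = tape.gradient(loss, [embedding])
```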

luozhouyang avatar Jun 08 '20 00:06 luozhouyang

@luozhouyang I understand this. But what algorithm is it using (e.g., word2vec, GloVe, ...)?

nashid avatar Jul 16 '21 19:07 nashid

No special algorithm is used: not word2vec, not GloVe, just a learnable 2-D matrix.

luozhouyang avatar Jul 19 '21 01:07 luozhouyang