hierarchical-attention-networks
Is the embedding initialized with a pre-trained one?
From the code it seems the embedding is not initialized with a pre-trained embedding (e.g. word2vec), although the paper says it is. Am I right, or did I miss something? Many thanks!
Relevant code in `_init_embedding`:
```python
def _init_embedding(self, scope):  # seems it does not use pre-trained word embeddings
    with tf.variable_scope(scope):
        with tf.variable_scope("embedding") as scope:
            self.embedding_matrix = tf.get_variable(
                name="embedding_matrix",
                shape=[self.vocab_size, self.embedding_size],
                initializer=layers.xavier_initializer(),
                dtype=tf.float32)
            self.inputs_embedded = tf.nn.embedding_lookup(
                self.embedding_matrix, self.inputs)
```
The relevant passage on word embeddings in Section 2.2 of the paper:
> Note that we directly use word embeddings. For a more complete model we could use a GRU to get word vectors directly from characters, similarly to (Ling et al., 2015). We omitted this for simplicity.
So it seems this implementation trains word embeddings specific to this task: the code learns the embedding matrix from scratch (effectively from a one-hot representation of the words) rather than using the character-level GRU representation mentioned above. Since the reported performance is lower than in the original paper, it looks like word2vec embeddings work better than embeddings learned from scratch. I'm currently changing the code so that it supports plugging in word2vec embeddings, and I'll make a pull request soon (see the sketch below).
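For reference, here is a minimal sketch of how pre-trained vectors could be plugged into `_init_embedding` (this is not the actual PR; the `pretrained` and `trainable` parameters are hypothetical, and `pretrained` is assumed to be a numpy array of shape `[vocab_size, embedding_size]` aligned with the model's vocabulary):

```python
import tensorflow as tf
from tensorflow.contrib import layers


def _init_embedding(self, scope, pretrained=None, trainable=True):
    # If a pre-trained matrix (e.g. word2vec) is given, use it as the
    # initializer; otherwise fall back to Xavier initialization as before.
    with tf.variable_scope(scope):
        with tf.variable_scope("embedding"):
            if pretrained is not None:
                assert pretrained.shape == (self.vocab_size, self.embedding_size)
                initializer = tf.constant_initializer(pretrained)
            else:
                initializer = layers.xavier_initializer()
            self.embedding_matrix = tf.get_variable(
                name="embedding_matrix",
                shape=[self.vocab_size, self.embedding_size],
                initializer=initializer,
                dtype=tf.float32,
                trainable=trainable)  # trainable=False would freeze the vectors
            self.inputs_embedded = tf.nn.embedding_lookup(
                self.embedding_matrix, self.inputs)
```

Setting `trainable=False` keeps the word2vec vectors fixed, while leaving it `True` only uses them as initialization and fine-tunes them during training.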
In my experience with other language-related tasks, using pretrained embeddings doesn't make much difference when the dataset is sufficiently large, although I suspect this is very task- and corpus-dependent.
@Sora77 would appreciate the PR!