hierarchical-attention-networks
                        Embeddings for special tokens/padding?
I was wondering where in the code you initialize the embeddings for the special tokens in the vocabulary (such as the unknown and padding tokens). Shouldn't the padding embedding be set to a zero vector and excluded from training? Or how are you dealing with these?
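For reference, this is roughly what I had in mind, as a minimal PyTorch sketch (the indices `PAD_IDX`/`UNK_IDX` and the sizes are placeholders I made up, not taken from your code):

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary layout -- adjust to match the actual repo.
PAD_IDX = 0
UNK_IDX = 1
vocab_size, emb_dim = 10_000, 200

# Passing padding_idx zero-initializes the <pad> row and excludes it
# from gradient updates, so it stays a zero vector during training.
embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=PAD_IDX)

# <unk> has no built-in handling; one common choice is to keep it
# trainable, optionally initialized to the mean of pretrained vectors:
# with torch.no_grad():
#     embedding.weight[UNK_IDX] = pretrained_vectors.mean(dim=0)
```

As far as I understand, `padding_idx` takes care of the padding token automatically (zero-initialized and never updated), but the unknown token would still be trained like any other word unless it is handled explicitly.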