treelstm.pytorch
Why zero out embeddings for special words if they are absent in vocab
Hi,
I noticed that in main.py you zero out the embeddings for special words if they are absent in the vocabulary:
# zero out the embeddings for padding and other special words if they are absent in vocab
for idx, item in enumerate([Constants.PAD_WORD, Constants.UNK_WORD, Constants.BOS_WORD, Constants.EOS_WORD]):
    emb[idx].zero_()
Is there any reason for doing so? Why not use random normal vectors instead?
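For concreteness, this is the kind of alternative I have in mind. A minimal sketch, assuming the special-token indices are 0..3 as in Constants.py (I haven't verified the exact values) and using a stand-in tensor for the pretrained matrix:

import torch

# Hypothetical indices mirroring Constants.py; adjust if they differ.
PAD, UNK, BOS, EOS = 0, 1, 2, 3

# Stand-in for the pretrained GloVe matrix loaded in main.py.
emb = torch.randn(20000, 300)

# Keep the padding row at zero, but give the other special tokens
# small random normal vectors instead of leaving them as all-zeros.
emb[PAD].zero_()
for idx in (UNK, BOS, EOS):
    emb[idx].normal_(mean=0.0, std=0.05)  # std roughly on the scale of typical GloVe entries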
Thanks.
Hi @Silenthinker
As far as I remember, when initialising the embeddings, I realised that the PAD_WORD embedding needs to be zeroed out. At the time I was unsure what to do with the other special words, so I left them zeroed out as well. I believe you can try initializing them normally; it should be fine.
Do let me know if you get a chance to try out random normal initialization!
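If it helps, here is a rough sketch of how the PAD row can be kept at zero even when embeddings are fine-tuned, by passing padding_idx when building the embedding layer. This is illustrative only, not the exact wiring in main.py, and the index 0 for PAD is an assumption:

import torch
import torch.nn as nn

PAD = 0  # assumed index of Constants.PAD_WORD

emb = torch.randn(20000, 300)  # stand-in for the pretrained matrix
emb[PAD].zero_()               # padding row starts out as a zero vector

# padding_idx masks the gradient of the PAD row, so it remains a zero
# vector even if the embedding weights are updated during training.
embedding = nn.Embedding(emb.size(0), emb.size(1), padding_idx=PAD)
with torch.no_grad():
    embedding.weight.copy_(emb)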
Thanks for your reply. I'll try it out.
However, the role of PAD_WORD is still unclear to me, since I couldn't find anywhere it is actually used to pad sentences. Did I miss it somewhere?
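For context, this is the kind of usage I was expecting to find somewhere in the pipeline. A generic sketch of padding in PyTorch, with an assumed PAD index of 0, not code from this repo:

import torch
from torch.nn.utils.rnn import pad_sequence

PAD = 0  # assumed index of Constants.PAD_WORD

# Two sentences of different lengths, already mapped to vocabulary indices.
sent_a = torch.tensor([5, 8, 2])
sent_b = torch.tensor([7, 4, 9, 11, 3])

# The shorter sentence is padded with PAD so both fit into one batch tensor.
batch = pad_sequence([sent_a, sent_b], batch_first=True, padding_value=PAD)
# batch has shape (2, 5); the first row ends with two PAD indices.

# A zero PAD embedding then guarantees that padded positions contribute
# nothing when word embeddings are summed or averaged over a sentence.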
Thanks.