awd-lstm-lm
Why does the decoder use nhid as its input size even when tie_weights is set to True?
I have never used torch, but if I understand your code correctly, the last LSTM layer's hidden size is equal to the first layer's input size (ninp) when tie_weights is true. But the decoder always takes the hidden size (nhid) as its input size:
LSTM Layers
self.rnns = [torch.nn.LSTM(ninp if l == 0 else nhid, nhid if l != nlayers - 1 else (ninp if tie_weights else nhid), 1, dropout=0) for l in range(nlayers)]
self.decoder = nn.Linear(nhid, ntoken)
There is a commented-out raise ValueError for the case where nhid differs from ninp when using tie_weights:
if tie_weights:
    #if nhid != ninp:
    #    raise ValueError('When using the tied flag, nhid must be equal to emsize')
    self.decoder.weight = self.encoder.weight
So when using tie_weights, should ninp be equal to nhid? I don't understand why there is this restriction instead of just using ninp as the input size of the decoder when tie_weights is set.
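To make the question concrete, here is a minimal sketch of the shapes involved. It is not the repository's code: the sizes are hypothetical, and it assumes only the standard torch.nn.Embedding and torch.nn.Linear modules, with the encoder being an embedding of shape (ntoken, ninp). As far as I can tell, the assignment replaces the decoder's weight with one of shape (ntoken, ninp), so the nhid passed to nn.Linear would no longer describe the weight actually used.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen so that ninp != nhid.
ntoken, ninp, nhid = 10, 4, 6

encoder = nn.Embedding(ntoken, ninp)  # weight shape: (ntoken, ninp)
decoder = nn.Linear(nhid, ntoken)     # weight shape: (ntoken, nhid)
shape_before = tuple(decoder.weight.shape)

# Tying: the decoder's weight Parameter is replaced by the encoder's.
decoder.weight = encoder.weight
shape_after = tuple(decoder.weight.shape)

# With tie_weights, the last LSTM layer outputs ninp features per step.
last_hidden = torch.randn(3, ninp)
logits_shape = tuple(decoder(last_hidden).shape)

print(shape_before, shape_after, logits_shape)
```

If this sketch matches what the model does, the decoder accepts ninp-sized inputs after tying even though it was constructed with nhid, which is the part I would like confirmed.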
I hope you will clarify this for me.