
Why is the decoder using nhid as input size even when tie_weights is set to True?

Open · FrancoisMentec opened this issue 6 years ago · 0 comments

I have never used torch, but if I understand your code correctly, the last LSTM layer's hidden size is equal to the first layer's input size (ninp) when tie_weights is true. Yet the decoder always takes the hidden size nhid as its input size.

LSTM layers:

    self.rnns = [torch.nn.LSTM(ninp if l == 0 else nhid,
                               nhid if l != nlayers - 1 else (ninp if tie_weights else nhid),
                               1, dropout=0)
                 for l in range(nlayers)]

Decoder:

    self.decoder = nn.Linear(nhid, ntoken)
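To make the layer sizes concrete, here is a minimal standalone sketch of what that comprehension produces (the sizes 400/1150 and nlayers = 3 are just illustrative assumptions on my part):

    import torch

    ninp, nhid, nlayers = 400, 1150, 3  # illustrative sizes, not necessarily the repo's defaults

    for tie_weights in (False, True):
        rnns = [torch.nn.LSTM(ninp if l == 0 else nhid,
                              nhid if l != nlayers - 1 else (ninp if tie_weights else nhid),
                              1, dropout=0)
                for l in range(nlayers)]
        print(tie_weights, [(r.input_size, r.hidden_size) for r in rnns])
        # False: [(400, 1150), (1150, 1150), (1150, 1150)]
        # True:  [(400, 1150), (1150, 1150), (1150, 400)]

So with tie_weights the last layer outputs vectors of size ninp, while the decoder is declared with input size nhid.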

There is a commented-out raise ValueError for the case where nhid differs from ninp when using tie_weights:

    if tie_weights:
        #if nhid != ninp:
        #    raise ValueError('When using the tied flag, nhid must be equal to emsize')
        self.decoder.weight = self.encoder.weight
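If I understand the shapes correctly, the constraint comes from the tying assignment itself; here is a minimal sketch of what happens to the decoder's weight (the sizes are again illustrative assumptions):

    import torch.nn as nn

    ntoken, ninp, nhid = 10000, 400, 1150  # illustrative sizes

    encoder = nn.Embedding(ntoken, ninp)  # weight shape: (ntoken, ninp)
    decoder = nn.Linear(nhid, ntoken)     # weight shape: (ntoken, nhid)

    # Tying replaces the decoder's (ntoken, nhid) weight with the encoder's
    # (ntoken, ninp) weight, so after this line the decoder actually expects
    # inputs of size ninp, regardless of the nhid it was declared with:
    decoder.weight = encoder.weight
    print(decoder.weight.shape)  # torch.Size([10000, 400])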

So when using tie_weights, ninp should be equal to nhid? I don't understand why this restriction exists instead of just using ninp as the input size of the decoder when tie_weights is set.
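In other words, something like the following hypothetical sketch (not a patch against your model.py, just to illustrate what I mean):

    import torch.nn as nn

    class TiedDecoderSketch(nn.Module):
        """Hypothetical alternative: size the decoder by the last layer's
        actual output dimension instead of always using nhid."""
        def __init__(self, ntoken, ninp, nhid, tie_weights):
            super().__init__()
            self.encoder = nn.Embedding(ntoken, ninp)
            decoder_in = ninp if tie_weights else nhid
            self.decoder = nn.Linear(decoder_in, ntoken)
            if tie_weights:
                # Shapes match by construction, so no nhid == ninp
                # restriction would be needed.
                self.decoder.weight = self.encoder.weight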

I hope you will clarify this for me.

FrancoisMentec · Jul 04 '18 09:07