practical-pytorch
question about decoder inputs from last encoder hidden state in batch translation example
In the batch translation example, the encoder is:
import torch
import torch.nn as nn

class EncoderRNN(nn.Module):
    def __init__(self, input_size, hidden_size, n_layers=1, dropout=0.1):
        super(EncoderRNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.n_layers = n_layers
        self.dropout = dropout
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, n_layers, dropout=self.dropout, bidirectional=True)

    def forward(self, input_seqs, input_lengths, hidden=None):
        # input_seqs: (max_len, batch) word indices, sorted by decreasing length
        embedded = self.embedding(input_seqs)
        packed = torch.nn.utils.rnn.pack_padded_sequence(embedded, input_lengths)
        outputs, hidden = self.gru(packed, hidden)
        outputs, output_lengths = torch.nn.utils.rnn.pad_packed_sequence(outputs)
        # Sum the forward and backward outputs so the result has hidden_size features
        outputs = outputs[:, :, :self.hidden_size] + outputs[:, :, self.hidden_size:]
        return outputs, hidden
encoder = EncoderRNN(input_lang.n_words, hidden_size, n_layers, dropout=dropout)
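For concreteness, here is a minimal sketch of how the encoder gets called, assuming the EncoderRNN class above; the vocabulary size, batch, and lengths are made-up values, not taken from the notebook:

hidden_size, n_layers, dropout = 8, 2, 0.1
vocab_size = 20   # hypothetical stand-in for input_lang.n_words
encoder = EncoderRNN(vocab_size, hidden_size, n_layers, dropout=dropout)

# (max_len, batch) LongTensor of word indices; lengths sorted descending for pack_padded_sequence
input_seqs = torch.randint(1, vocab_size, (5, 3))
input_lengths = [5, 4, 2]

outputs, encoder_hidden = encoder(input_seqs, input_lengths)
print(outputs.size())         # (5, 3, 8): (max_len, batch, hidden_size) after summing the two directions
print(encoder_hidden.size())  # (4, 3, 8): (n_layers * num_directions, batch, hidden_size)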
The encoder is a two-layer bidirectional GRU.
outputs, hidden = self.gru(packed, hidden)
From the PyTorch docs: h_n has shape (num_layers * num_directions, batch, hidden_size). See also https://discuss.pytorch.org/t/how-can-i-know-which-part-of-h-n-of-bidirectional-rnn-is-for-backward-process/3883
In my view, hidden = (last_hidden_of_layer0_forward, first_hidden_of_layer0_backward, last_hidden_of_layer1_forward, first_hidden_of_layer1_backward).
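One way to check that ordering (a sketch, continuing the toy example above and using the view trick from the PyTorch docs for separating layers and directions):

num_directions = 2
h = encoder_hidden.view(n_layers, num_directions, encoder_hidden.size(1), encoder_hidden.size(2))
# h[l, 0] == encoder_hidden[2*l]     : forward state of layer l (hidden at the last time step)
# h[l, 1] == encoder_hidden[2*l + 1] : backward state of layer l (hidden at the first time step)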
But decoder_hidden's initial value seems wrong:
decoder_hidden = encoder_hidden[:decoder.n_layers] # Use last (forward) hidden state from encoder
According to the comment, we use the last (forward) hidden state of the encoder, so we should use:
decoder_hidden = encoder_hidden[::2]  # encoder_hidden[0] and encoder_hidden[2] are (last_hidden_of_layer0_forward, last_hidden_of_layer1_forward)
As for encoder_hidden[:decoder.n_layers]: if decoder.n_layers = 1, encoder_hidden[:1] is the first hidden state, not the LAST hidden state, right?
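To make the difference concrete, a small sketch with the 2-layer toy encoder from above (whether [::2] is really the intended fix is exactly my question):

# encoder_hidden rows, in my understanding:
#   0: layer0 forward, 1: layer0 backward, 2: layer1 forward, 3: layer1 backward
current  = encoder_hidden[:2]   # what the notebook does with decoder.n_layers = 2: layer0 forward + layer0 backward
proposed = encoder_hidden[::2]  # forward state of every layer: rows 0 and 2
print(current.size(), proposed.size())  # both (2, 3, 8), but they contain different states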