pytorch-seq2seq
Tutorial 1: Differences between Encoder/Decoder in Seq2Seq Model
Why is it that for the Encoder we can pass in the whole sentence and its forward pass does the recurrence for us, but for the Decoder we have to do the recurrence ourselves in a loop?
For the encoder, we already have the entire source (input) sequence, so we can pass it through the RNN in a single call. However, we don't have the entire target (output) sequence -- we do when training, but we wouldn't when doing inference -- so we need to generate it one token at a time in a loop, feeding each step's input back into the decoder.
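As a rough illustration (not the tutorial's exact code -- the GRU sizes and the `teacher_forcing_ratio` value below are just placeholders), the encoder runs its recurrence over the whole source in a single call, while the decoder is unrolled manually one step at a time:

```python
import random
import torch
import torch.nn as nn

emb_dim, hid_dim, vocab_size = 32, 64, 100
src_len, trg_len, batch_size = 7, 9, 4

enc_embedding = nn.Embedding(vocab_size, emb_dim)
dec_embedding = nn.Embedding(vocab_size, emb_dim)
encoder_rnn = nn.GRU(emb_dim, hid_dim)   # expects [seq_len, batch, emb_dim]
decoder_rnn = nn.GRU(emb_dim, hid_dim)
fc_out = nn.Linear(hid_dim, vocab_size)

src = torch.randint(0, vocab_size, (src_len, batch_size))  # [src_len, batch]
trg = torch.randint(0, vocab_size, (trg_len, batch_size))  # [trg_len, batch]

# Encoder: one call runs the recurrence over every source timestep.
_, hidden = encoder_rnn(enc_embedding(src))  # hidden: [1, batch, hid_dim]

# Decoder: one timestep per iteration, because at inference time the next
# input token only exists once we have predicted the previous one.
input_tok = trg[0]              # <sos> tokens, [batch]
outputs = []
teacher_forcing_ratio = 0.5     # illustrative value
for t in range(1, trg_len):
    emb = dec_embedding(input_tok).unsqueeze(0)  # [1, batch, emb_dim]
    output, hidden = decoder_rnn(emb, hidden)    # a single recurrence step
    logits = fc_out(output.squeeze(0))           # [batch, vocab_size]
    outputs.append(logits)
    # During training we may feed in the ground-truth token (teacher forcing);
    # during inference we must feed in our own prediction.
    use_teacher = random.random() < teacher_forcing_ratio
    input_tok = trg[t] if use_teacher else logits.argmax(1)
```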
Hello, I also have a question about Tutorial 1, regarding the input to the Encoder. Is it okay to change the dimensions from [seq_len, batch_size] to [batch_size, seq_len]? In PyTorch's CNN modules, I've seen batch_size always come as the first dimension. Do we also need to alter the Embedding layer accordingly?
You don't need to do anything with the embedding layer. However, for RNN models such as the LSTM, if you want the batch dimension first then you need to initialize them with batch_first=True.
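For reference, here is a small sketch (not from the tutorial) showing that nn.Embedding is a per-token lookup that works with either layout, while the LSTM has to be told which layout to expect via batch_first:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hid_dim = 100, 32, 64
seq_len, batch_size = 7, 4

embedding = nn.Embedding(vocab_size, emb_dim)

# Default layout: [seq_len, batch_size]
rnn_seq_first = nn.LSTM(emb_dim, hid_dim)      # batch_first=False (default)
src = torch.randint(0, vocab_size, (seq_len, batch_size))
out, (h, c) = rnn_seq_first(embedding(src))    # out: [seq_len, batch, hid_dim]

# Batch-first layout: [batch_size, seq_len]
rnn_batch_first = nn.LSTM(emb_dim, hid_dim, batch_first=True)
src_bf = src.transpose(0, 1)                   # [batch, seq_len]
out_bf, _ = rnn_batch_first(embedding(src_bf)) # out_bf: [batch, seq_len, hid_dim]
```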