pytorch-seq2seq
Tutorial 1: Differences between Encoder/Decoder in Seq2Seq Model
Why is it that for the Encoder we can pass in the whole sentence and its forward pass does the recurrence for us, but for the Decoder we have to do the recurrence ourselves in a loop?
For the encoder, we already have the entire source (input) sequence, so we can pass it through the RNN in a single call. However, we don't have the entire target (output) sequence -- we do when training, but we wouldn't when doing inference -- so we need to generate it one token at a time in a loop, feeding each step's input back into the decoder.
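As a rough illustration (not the tutorial's exact code -- the GRU sizes and the `teacher_forcing_ratio` value below are just placeholders), the encoder runs its recurrence over the whole source in a single call, while the decoder is unrolled manually one step at a time:

```python
import random
import torch
import torch.nn as nn

emb_dim, hid_dim, vocab_size = 32, 64, 100
src_len, trg_len, batch_size = 7, 9, 4

enc_embedding = nn.Embedding(vocab_size, emb_dim)
dec_embedding = nn.Embedding(vocab_size, emb_dim)
encoder_rnn = nn.GRU(emb_dim, hid_dim)   # expects [seq_len, batch, emb_dim]
decoder_rnn = nn.GRU(emb_dim, hid_dim)
fc_out = nn.Linear(hid_dim, vocab_size)

src = torch.randint(0, vocab_size, (src_len, batch_size))  # [src_len, batch]
trg = torch.randint(0, vocab_size, (trg_len, batch_size))  # [trg_len, batch]

# Encoder: one call runs the recurrence over every source timestep.
_, hidden = encoder_rnn(enc_embedding(src))  # hidden: [1, batch, hid_dim]

# Decoder: one timestep per iteration, because at inference time the next
# input token only exists once we have predicted the previous one.
input_tok = trg[0]              # <sos> tokens, [batch]
outputs = []
teacher_forcing_ratio = 0.5     # illustrative value
for t in range(1, trg_len):
    emb = dec_embedding(input_tok).unsqueeze(0)  # [1, batch, emb_dim]
    output, hidden = decoder_rnn(emb, hidden)    # a single recurrence step
    logits = fc_out(output.squeeze(0))           # [batch, vocab_size]
    outputs.append(logits)
    # During training we may feed in the ground-truth token (teacher forcing);
    # during inference we must feed in our own prediction.
    use_teacher = random.random() < teacher_forcing_ratio
    input_tok = trg[t] if use_teacher else logits.argmax(1)
```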
Hello, I also have a question about Tutorial 1, regarding the input to the Encoder. Is it okay to change the dimensions from [seq_len, batch_size] to [batch_size, seq_len]? In PyTorch's CNN modules, I've seen batch_size always come as the first dimension. Do we also need to alter the Embedding layer accordingly?
You don't need to do anything with the embedding layer. However, for RNN models such as the LSTM, if you want the batch dimension first then you need to initialize them with batch_first=True.
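For reference, here is a small sketch (not from the tutorial) showing that nn.Embedding is a per-token lookup that works with either layout, while the LSTM has to be told which layout to expect via batch_first:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hid_dim = 100, 32, 64
seq_len, batch_size = 7, 4

embedding = nn.Embedding(vocab_size, emb_dim)

# Default layout: [seq_len, batch_size]
rnn_seq_first = nn.LSTM(emb_dim, hid_dim)      # batch_first=False (default)
src = torch.randint(0, vocab_size, (seq_len, batch_size))
out, (h, c) = rnn_seq_first(embedding(src))    # out: [seq_len, batch, hid_dim]

# Batch-first layout: [batch_size, seq_len]
rnn_batch_first = nn.LSTM(emb_dim, hid_dim, batch_first=True)
src_bf = src.transpose(0, 1)                   # [batch, seq_len]
out_bf, _ = rnn_batch_first(embedding(src_bf)) # out_bf: [batch, seq_len, hid_dim]
```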