
Tutorial 4: Decoder - the calculation of prediction

Open actforjason opened this issue 3 years ago • 2 comments

Why use torch.cat((output, weighted, embedded), dim=1)? Isn't just using output usually enough?

        # after squeezing out the time dimension, each tensor is [batch size, *]
        embedded = embedded.squeeze(0)   # [batch size, emb dim]
        output = output.squeeze(0)       # [batch size, dec hid dim]
        weighted = weighted.squeeze(0)   # [batch size, enc hid dim * 2]
        
        # prediction = [batch size, output dim]
        prediction = self.fc_out(torch.cat((output, weighted, embedded), dim=1))

actforjason avatar Mar 10 '21 06:03 actforjason

We could just use output, but the notebook is replicating this paper which calculates the prediction using: the decoder hidden state (output), the attention weighted context (weighted) and the current input word (embedded) - see appendix section 2.2.
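For concreteness, here is a minimal, self-contained sketch of the shape bookkeeping behind that concatenation. The sizes are illustrative placeholders, not the notebook's actual hyperparameters; the point is that fc_out's input width has to match the sum of the three widths:

    import torch
    import torch.nn as nn

    # illustrative sizes only
    batch_size, emb_dim, enc_hid_dim, dec_hid_dim, output_dim = 4, 256, 512, 512, 10000

    embedded = torch.randn(batch_size, emb_dim)          # current input word
    weighted = torch.randn(batch_size, enc_hid_dim * 2)  # attention weighted context
    output = torch.randn(batch_size, dec_hid_dim)        # decoder hidden state

    # fc_out must be sized for the concatenated vector
    fc_out = nn.Linear(dec_hid_dim + (enc_hid_dim * 2) + emb_dim, output_dim)

    prediction = fc_out(torch.cat((output, weighted, embedded), dim=1))
    print(prediction.shape)  # torch.Size([4, 10000])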

Maybe output is enough in this case. Feel free to try it and let me know if the results are any different.
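If you do try it, the output-only variant would look roughly like this (placeholder sizes again); only fc_out's input width changes, the rest of the decoder stays the same:

    import torch
    import torch.nn as nn

    batch_size, dec_hid_dim, output_dim = 4, 512, 10000  # placeholder sizes

    output = torch.randn(batch_size, dec_hid_dim)  # decoder hidden state after .squeeze(0)
    fc_out = nn.Linear(dec_hid_dim, output_dim)    # no weighted/embedded concatenated in

    prediction = fc_out(output)
    print(prediction.shape)  # torch.Size([4, 10000])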

bentrevett avatar Mar 10 '21 14:03 bentrevett

Thank you, I got it.

actforjason avatar Mar 10 '21 15:03 actforjason