
Updated Sentiment Analysis: what's the impact of not using `pack_padded_sequence()`?

Open githubrandomuser2017 opened this issue 5 years ago • 1 comment

Thanks for your awesome tutorials. In the one for "Updated Sentiment Analysis", you wrote the following:

Without packed padded sequences, hidden and cell are tensors from the last element in the sequence, which will most probably be a pad token, however when using packed padded sequences they are both from the last non-padded element in the sequence.

What does this mean exactly? If I'm using an LSTM, the final hidden state is an ongoing representation of the sequence up to and including the last token. If the last few tokens are `<pad>`, does that matter, given that the hidden state has already captured the preceding non-`<pad>` tokens?

githubrandomuser2017 avatar Aug 23 '20 05:08 githubrandomuser2017

In theory, it wouldn't matter, as your RNN should learn to ignore `<pad>` tokens and not update its internal hidden state when it sees one. However, your RNN has to explicitly learn that: it starts off with no prior knowledge that `<pad>` tokens carry no information. By using packed padded sequences we avoid the issue altogether. Your model doesn't have to learn to ignore `<pad>` tokens because it never sees them in the first place.

bentrevett avatar Aug 25 '20 16:08 bentrevett
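
The difference is easy to see in a small experiment (a minimal sketch, not code from the tutorial): run the same padded batch through an LSTM once directly and once via `pack_padded_sequence`. Without packing, the returned hidden state is the state after the final (pad) timestep; with packing, it is the state after the last real token.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=3, batch_first=True)

# one sequence with true length 2, padded to length 5 with zero vectors
seq = torch.randn(1, 2, 4)
padded = torch.cat([seq, torch.zeros(1, 3, 4)], dim=1)  # shape (1, 5, 4)

# without packing: the LSTM keeps updating its state on the pad timesteps,
# so `hidden` reflects the last (pad) position
_, (h_unpacked, _) = lstm(padded)

# with packing: the LSTM stops at the last real token
packed = pack_padded_sequence(padded, lengths=torch.tensor([2]), batch_first=True)
_, (h_packed, _) = lstm(packed)

# reference: running only the real tokens gives the same state as packing
_, (h_true, _) = lstm(seq)

print(torch.allclose(h_packed, h_true))    # packed state == state at last real token
print(torch.allclose(h_unpacked, h_true))  # unpacked state has drifted past it
```

Even though the pad inputs are all zeros, the LSTM's gates still have biases, so its state keeps changing on every pad step; packing is what guarantees the hidden state corresponds to the last non-padded element.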