recurrent-batch-normalization-pytorch
Why do you NOT use packed padding?
Why do you NOT use packed padding but instead use masks?
Hi, as I understand it, the most common use of packed sequences is as input to the predefined RNN modules (e.g. torch.nn.LSTM, torch.nn.GRU, ...). The batch-normalized RNN, however, requires modifying the recurrent computation itself, so the time-step loop has to be written by hand, and in that setting packed sequences offer no real advantage over masks. Also, at the time of implementation, pack_padded_sequence could not accept unsorted sequences (the input had to be sorted by length), which would have introduced another complication. A rough sketch of the masking approach is given below.
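To illustrate the idea (this is only a minimal sketch, not the repository's actual code), the loop below runs an arbitrary recurrent cell over a padded batch and uses a length-derived mask to freeze the hidden state at padded time steps. The function name `masked_rnn_loop` and the assumption that the cell exposes a `hidden_size` attribute (true for `torch.nn.RNNCell`) are mine, not from the repo.

```python
import torch


def masked_rnn_loop(cell, inputs, lengths):
    """Run a custom recurrent cell over padded inputs using a mask
    instead of pack_padded_sequence.

    cell:    callable (x_t, h_prev) -> h_t with a .hidden_size attribute
             (e.g. torch.nn.RNNCell or a batch-normalized cell)
    inputs:  (seq_len, batch, input_size) padded tensor
    lengths: (batch,) tensor of true sequence lengths
    """
    seq_len, batch_size, _ = inputs.size()
    # mask[t, b] = 1.0 while t < lengths[b], else 0.0
    time_steps = torch.arange(seq_len).unsqueeze(1)        # (seq_len, 1)
    mask = (time_steps < lengths.unsqueeze(0)).float()     # (seq_len, batch)

    h = inputs.new_zeros(batch_size, cell.hidden_size)
    outputs = []
    for t in range(seq_len):
        h_next = cell(inputs[t], h)
        m = mask[t].unsqueeze(1)                            # (batch, 1)
        # keep the previous hidden state at padded positions
        h = m * h_next + (1.0 - m) * h
        outputs.append(h)
    return torch.stack(outputs), h


# Example usage with a plain RNNCell standing in for a batch-normalized cell
cell = torch.nn.RNNCell(input_size=8, hidden_size=16)
x = torch.randn(5, 3, 8)                    # seq_len=5, batch=3
lengths = torch.tensor([5, 3, 2])           # no sorting required
outputs, last_h = masked_rnn_loop(cell, x, lengths)
```

Because the mask is applied per time step inside the loop, the batch does not need to be sorted by length, and the same pattern works for any custom cell whose recurrent computation has been modified.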