recurrent-batch-normalization-pytorch
Is dropout really applied?
Hi, thanks for sharing the code. I have a question regarding dropout; hope it's not a stupid one. Here in the code:
```python
for layer in range(self.num_layers):
    cell = self.get_cell(layer)
    hx_layer = (hx[0][layer,:,:], hx[1][layer,:,:])
    if layer == 0:
        layer_output, (layer_h_n, layer_c_n) = LSTM._forward_rnn(
            cell=cell, input_=input_, length=length, hx=hx_layer)
    else:
        layer_output, (layer_h_n, layer_c_n) = LSTM._forward_rnn(
            cell=cell, input_=layer_output, length=length, hx=hx_layer)
    input_ = self.dropout_layer(layer_output)
```
It seems to me the dropout result should actually be assigned to layer_output (or fed into the next layer), because input_ is never used after layer == 0.
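For what it's worth, here is a rough sketch of how the loop could be rewritten so the dropped-out output is what the next layer actually consumes (just my assumption of the intended behavior, not the repo's code; once input_ is reused, the if/else branches collapse, and dropout is skipped after the last layer to match torch.nn.LSTM's convention):

```python
for layer in range(self.num_layers):
    cell = self.get_cell(layer)
    hx_layer = (hx[0][layer,:,:], hx[1][layer,:,:])
    # input_ now carries the previous layer's (dropped-out) output.
    layer_output, (layer_h_n, layer_c_n) = LSTM._forward_rnn(
        cell=cell, input_=input_, length=length, hx=hx_layer)
    if layer < self.num_layers - 1:
        # Apply dropout between stacked layers only.
        input_ = self.dropout_layer(layer_output)
    else:
        input_ = layer_output
```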
It does look like an oversight, thanks! I plan to refactor the entire codebase when I have some time, including what you've pointed out and other improvements. (Or I would be really grateful if you sent this repo a PR!)