Dimension of subsequent layers in Hypernetwork
Hi, I was reading through your implementation of HyperLSTM and the associated paper, and I got lost in the shapes of the layers after the first one. Could you please explain why the input size is `2 * main_lstm_hidden_size`?
Unspoken Felt
Sorry for the very late reply. I'm not sure what you are referring to exactly; could you point to a specific line or section of the code?
Lines 221-223 in class `HyperLSTM` state:

```python
self.cells = nn.ModuleList([HyperLSTMCell(input_size, hidden_size, hyper_size, n_z)] +
                           [HyperLSTMCell(hidden_size, hidden_size, hyper_size, n_z)
                            for _ in range(n_layers - 1)])
```
Each `HyperLSTMCell` constructed in this chunk runs Line 120 in its initialisation function:

```python
self.hyper = LSTMCell(hidden_size + input_size, hyper_size, layer_norm=True)
```
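If it helps to see where the `hidden_size + input_size` term comes from: in the HyperLSTM formulation, the hyper network's input is the concatenation of the main cell's previous hidden state and the current input to that layer. A minimal plain-Python size check (the concrete sizes here are illustrative, not from the repo):

```python
# Size bookkeeping for the hyper LSTM's input: it sees [h; x],
# so its width is hidden_size + input_size.
hidden_size, input_size = 16, 8   # illustrative sizes
h = [0.0] * hidden_size           # previous hidden state of the main LSTM
x = [0.0] * input_size            # current input to this layer
hyper_in = h + x                  # concatenation
print(len(hyper_in))              # 24 == hidden_size + input_size
```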
Thus, for the first cell created by that chunk, the hyper network is

```python
LSTMCell(hidden_size + input_size, hyper_size, layer_norm=True)
```

and for each subsequent cell it is

```python
LSTMCell(hidden_size + hidden_size, hyper_size, layer_norm=True)
```

<----- I am confused about the `2 * hidden_size` dimension here.
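A minimal sketch of the size substitution implied by the two constructor calls above: every layer after the first receives the previous layer's hidden state as its input, so its `input_size` equals `hidden_size`, which is where the `2 * hidden_size` width comes from (the concrete sizes below are illustrative, not from the repo):

```python
# Plain-Python sketch of the sizes implied by the ModuleList above.
hidden_size, input_size, n_layers = 16, 8, 3  # illustrative sizes

# input_size of each HyperLSTMCell: the first sees the model input,
# every later layer sees the previous layer's hidden state.
cell_input_sizes = [input_size] + [hidden_size] * (n_layers - 1)

# Line 120: each cell's internal hyper LSTM takes hidden_size + input_size.
hyper_input_sizes = [hidden_size + s for s in cell_input_sizes]
print(hyper_input_sizes)  # [24, 32, 32] -- layers after the first get 2 * hidden_size
```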