
Dimension of subsequent layers in Hypernetwork

Open Simply-Adi opened this issue 2 years ago • 3 comments

Hi, I was reading through your implementation of HyperLSTM and the associated paper. I got lost in the shapes of the layers after the first one. Could you please explain why the hypernetwork's input size is 2 * (the main LSTM's hidden_size) for those layers?

Simply-Adi avatar Feb 27 '23 23:02 Simply-Adi


Sorry for the very late reply. I'm not sure what you are referring to exactly; could you please point to a specific line or section of the code?

vpj avatar Jun 30 '23 10:06 vpj

Lines 221-223 in class HyperLSTM state:

self.cells = nn.ModuleList([HyperLSTMCell(input_size, hidden_size, hyper_size, n_z)] +
                                  [HyperLSTMCell(hidden_size, hidden_size, hyper_size, n_z) for _ in
                                   range(n_layers - 1)])

Each HyperLSTMCell constructor in turn executes line 120 of the initialisation function:

self.hyper = LSTMCell(hidden_size + input_size, hyper_size, layer_norm=True)

Thus, the hyper cell created for the first layer is

LSTMCell(hidden_size + input_size, hyper_size, layer_norm=True)

while for each subsequent layer (whose input_size is hidden_size) it is:

LSTMCell(hidden_size + hidden_size, hyper_size, layer_norm=True) <----- I am confused about the 2*hidden_size dimension here.
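A minimal sketch of the shape arithmetic, assuming (as line 120 suggests) that the hyper LSTM cell receives the concatenation of the main cell's hidden state h and its input x. For layers after the first, the cell's input is the previous layer's hidden state, so input_size equals hidden_size there and the concatenated width becomes 2 * hidden_size. The names below mirror the quoted code but the example itself is illustrative, not taken from the repository:

```python
hidden_size, input_size = 8, 4

h = [0.0] * hidden_size   # main LSTM hidden state
x = [0.0] * input_size    # input to the first HyperLSTMCell

# Stands in for concatenating (h, x) before feeding the hyper cell,
# matching the hidden_size + input_size width on line 120.
hyper_input = h + x
assert len(hyper_input) == hidden_size + input_size   # first layer: 12

# Layers 2..n receive the previous layer's hidden state as input,
# so input_size == hidden_size and the width is 2 * hidden_size.
x_deep = [0.0] * hidden_size
assert len(h + x_deep) == 2 * hidden_size             # deeper layers: 16
```

Under this reading, the 2*hidden_size is not a special design choice for deeper layers; it is the same hidden_size + input_size formula with input_size now equal to hidden_size.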

Simply-Adi avatar Jun 30 '23 10:06 Simply-Adi