
LSTM hidden layer computation

christopher5106 opened this issue 9 years ago · 2 comments

Hi, I'm just wondering why you use this form in https://github.com/coreylynch/grid-lstm/blob/master/model/GridLSTM.lua#L31

local next_h = nn.CMulTable()({out_gate, nn.Tanh()(next_c)})

in the paper it is

local next_h = nn.Tanh()(nn.CMulTable()({out_gate, next_c}))
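The two variants differ only in where the `tanh` is applied: the repo computes `h = o ⊙ tanh(c)` (the standard LSTM output), while the formula quoted from the paper would be `h = tanh(o ⊙ c)`. A minimal NumPy sketch (illustrative only, not the repo's Torch code) shows they generally give different values:

```python
import numpy as np

np.random.seed(0)
out_gate = 1.0 / (1.0 + np.exp(-np.random.randn(4)))  # sigmoid-activated output gate
next_c = np.random.randn(4)                           # new cell state

# Repo version: h = o * tanh(c)  (standard LSTM output equation)
h_repo = out_gate * np.tanh(next_c)

# Variant quoted from the paper: h = tanh(o * c)
h_paper = np.tanh(out_gate * next_c)

print(h_repo)
print(h_paper)
```

Both keep `h` in (-1, 1), but they are not equal in general, so the choice does change the model.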

Thank you for your response

christopher5106 avatar Sep 21 '16 21:09 christopher5106

I also have a question about weight sharing: in your example, the time LSTM's weights are not shared between layers (they are shared across time only, thanks to clones), while the depth LSTM's weights are shared across both layers and time. This makes a lot of sense, in fact.

But it surprised me at first read, because the "tied N-LSTM" by definition shares weights along all dimensions.

Either

  1. NOT cloning the depth LSTM's weights across time, or
  2. also sharing the time LSTM's weights across depth

would be more coherent... do you have any thoughts on this?
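The two sharing schemes being compared can be sketched with plain Python references (a toy illustration, not the repo's Torch cloning code; the variable names are made up). Clones that share weights point at the same parameter object, so updating one updates all:

```python
n_layers, n_steps = 2, 3

# Time LSTM: one weight set per layer, cloned (hence shared) across time only.
time_weights = [{"W": [0.0]} for _ in range(n_layers)]     # independent per layer
time_lstm = [[time_weights[l] for _ in range(n_steps)]     # same object per time step
             for l in range(n_layers)]

# Depth LSTM: a single weight set shared across both layers and time.
depth_weights = {"W": [0.0]}
depth_lstm = [[depth_weights for _ in range(n_steps)] for _ in range(n_layers)]

# Updating one time-LSTM clone affects every time step of that layer,
# but not the other layer.
time_lstm[0][0]["W"][0] = 1.0

# Updating one depth-LSTM clone affects every layer and every time step.
depth_lstm[0][0]["W"][0] = 2.0
```

Under this picture, option 2 would amount to building the time LSTM the same way as `depth_lstm` above.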

Thanks,

christopher5106 avatar Sep 23 '16 14:09 christopher5106

I think the paper says the same weights are used along the time and depth dimensions of the LSTM. You can refer to Section 4.3 of the paper.

ytoon avatar Dec 16 '16 06:12 ytoon