Regarding the class att_rnn
Hi, this is Yong Joon Lee. I am implementing an LAS model based on your code. I know you may not remember the details, since you wrote it three years ago, but I think the call method of the class att_rnn may have a small mistake in its ordering. In call, s is assigned twice in a row before c, the attention context, is computed.
Your ordering is as follows:
s = self.rnn(inputs = inputs, states = states) # s = m_{t}, [m_{t}, c_{t}] #m is memory(hidden) and c is carry(cell)
s = self.rnn2(inputs=s[0], states = s[1])[1] # s = m_{t+1}, c_{t+1}
c = self.attention_context([s[0], h])
But shouldn't it be ordered like this instead?
s = self.rnn(inputs = inputs, states = states) # s = m_{t}, [m_{t}, c_{t}]
c = self.attention_context([s[0], h])
s = self.rnn2(inputs=s[0], states = s[1])[1] # s = m_{t+1}, c_{t+1}
As the original paper describes, the attention context vector at timestep t is computed from s_t and h, where h is the output of the pBLSTM listener: s_t = RNN(s_{t-1}, y_{t-1}, c_{t-1}), followed by c_t = AttentionContext(s_t, h). With your ordering, the attention context vector is instead derived from s_{t+1} and h. Thank you for your great work.
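
For reference, below is a minimal sketch of one speller step following that ordering. It is not based on your actual code: the names decoder_cell and char_dist, the dot-product attention, and the dimensions are placeholders I chose for illustration, assuming s_t and the listener outputs h share the same feature size.

import tensorflow as tf

# Hypothetical layers for illustration only; not the identifiers used in your repo.
decoder_cell = tf.keras.layers.LSTMCell(256)
char_dist = tf.keras.layers.Dense(30)  # placeholder vocabulary size

def attention_context(s_t, h):
    # Simple dot-product attention over the listener outputs h: [batch, T, dim].
    scores = tf.matmul(h, tf.expand_dims(s_t, -1))   # [batch, T, 1]
    alpha = tf.nn.softmax(scores, axis=1)            # attention weights over time
    return tf.reduce_sum(alpha * h, axis=1)          # context c_t: [batch, dim]

def decoder_step(y_prev, c_prev, states_prev, h):
    # s_t = RNN(s_{t-1}, y_{t-1}, c_{t-1})
    s_t, states_t = decoder_cell(tf.concat([y_prev, c_prev], axis=-1), states_prev)
    # c_t = AttentionContext(s_t, h): the context is built from s_t, not s_{t+1}
    c_t = attention_context(s_t, h)
    # P(y_t | ...) = CharacterDistribution(s_t, c_t)
    logits = char_dist(tf.concat([s_t, c_t], axis=-1))
    return logits, c_t, states_t

Here y_prev and c_prev are [batch, dim] tensors for the previous character embedding and context, and h is [batch, T, dim]; the next rnn2 layer (if stacked) would then consume s_t and c_t within the same timestep.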