attention-networks-for-classification

Init hidden state for the 2nd sentence onward

Open • smutahoang opened this issue 6 years ago • 2 comments

Hi,

Thanks for sharing your implementation. This helps me a lot.

I just wonder about the way you initialize the hidden state for the second sentence onward. Specifically, in the "def train_data(mini_batch, targets, word_attn_model, sent_attn_model, word_optimizer, sent_optimizer, criterion):" function (in the "attention_model_validation_experiments" notebook), you loop over the sentences with: "_s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)". This means that both the forward and backward states of the last word in sentence i are used to initialize the forward and backward states of sentence i+1. I can understand the forward case, since the two sentences are consecutive, but initializing the backward state this way does not seem very reasonable.
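
For concreteness, here is a minimal PyTorch sketch of the carry-over pattern I am describing. It uses a bare one-layer bidirectional nn.GRU with made-up dimensions in place of the actual word_attn_model, so it illustrates only the state handling, not the repo's exact code:

```python
import torch
import torch.nn as nn

# Sketch of the pattern in question (hypothetical shapes, not the repo's
# code): a one-layer bidirectional GRU word encoder whose final hidden
# state for sentence i is passed on, unchanged, as the initial hidden
# state for sentence i + 1.
word_gru = nn.GRU(input_size=100, hidden_size=50, bidirectional=True)

max_sents, max_words, batch_size, emb_dim = 4, 10, 8, 100
mini_batch = torch.randn(max_sents, max_words, batch_size, emb_dim)

# Hidden state shape is (num_directions, batch, hidden) = (2, 8, 50);
# index 0 is the forward direction, index 1 the backward direction.
state_word = torch.zeros(2, batch_size, 50)

for i in range(max_sents):
    # Both halves of state_word -- forward AND backward -- carry over
    # to the next sentence here, which is the behaviour I am asking about.
    _s, state_word = word_gru(mini_batch[i], state_word)
```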

Can you please explain this in more detail? Thanks.

smutahoang • Jan 25 '19

Hi,

I'm sorry, but I don't understand what you are asking. You have to initialise both the forward and the backward states at the start of training.

Please get back to me with a bit more clarity so that I can help you out.

Thanks.

Sandeep42 • Jan 31 '19

Let's use last_h_S = (last_h_forward, last_h_backward) to denote the hidden states of the last word in sentence number S, and init_h_[S+1] to denote the initial hidden states of sentence number S + 1.

From the code, I understand that you assign init_h_[S+1] = last_h_S = (last_h_forward, last_h_backward) (am I right?). Wouldn't it be more reasonable to set init_h_[S+1] = (last_h_forward, 0)?
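
For illustration, here is a hypothetical variant of the sketch from my first comment that implements init_h_[S+1] = (last_h_forward, 0). Again it uses a bare bidirectional GRU with made-up dimensions rather than the actual word_attn_model:

```python
import torch
import torch.nn as nn

# Hypothetical variant implementing the suggestion: keep the forward
# state across sentence boundaries but reset the backward state to zero,
# since the backward pass of sentence S + 1 starts at that sentence's
# own last word, not inside sentence S. Shapes are illustrative.
word_gru = nn.GRU(input_size=100, hidden_size=50, bidirectional=True)

max_sents, max_words, batch_size, emb_dim = 4, 10, 8, 100
mini_batch = torch.randn(max_sents, max_words, batch_size, emb_dim)
state_word = torch.zeros(2, batch_size, 50)  # (directions, batch, hidden)

for i in range(max_sents):
    _s, state_word = word_gru(mini_batch[i], state_word)
    # init_h_[S+1] = (last_h_forward, 0): keep the forward half (index 0),
    # replace the backward half (index 1) with zeros.
    state_word = torch.cat(
        [state_word[:1], torch.zeros_like(state_word[1:])], dim=0
    )
```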

smutahoang • Feb 07 '19