
encoder state in `matching_encoder_utils.py`

Open pjlintw opened this issue 5 years ago • 4 comments

I read the paper "Leveraging Context Information for Natural Question Generation".

Section 2.2 says:

> Each encoder state h_j is the concatenation of two bi-directional LSTM states.

(Section 3.2 also claims that the proposed decoder takes the concatenation from the BiLSTM, the same as in Section 2.2.)
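
To make the paper's description concrete, here is a minimal sketch (my own illustration in TF 1.x style, not code from this repo) of an encoder state built as the concatenation of forward and backward LSTM states:

```python
import tensorflow as tf  # TF 1.x, matching the era of this repo

def bilstm_encoder_states(inputs, lengths, hidden_dim):
    """Sketch: inputs [batch, passage_len, emb_dim] -> [batch, passage_len, 2*hidden_dim]."""
    fw_cell = tf.nn.rnn_cell.LSTMCell(hidden_dim)
    bw_cell = tf.nn.rnn_cell.LSTMCell(hidden_dim)
    (fw_out, bw_out), _ = tf.nn.bidirectional_dynamic_rnn(
        fw_cell, bw_cell, inputs, sequence_length=lengths, dtype=tf.float32)
    # Each h_j = [fw_h_j ; bw_h_j], the concatenation of the two directions.
    return tf.concat([fw_out, bw_out], axis=2)
```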



But line 317 of `matching_encoder_utils.py` shows that the encoder state concatenates `aggregation_representation` with `in_passage_repres`, the output of the filter layer (line 181), not the output of the BiLSTM:

```python
encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)
```



According to the paper, the encoder hidden state should instead concatenate the aggregation representation with `cur_in_passage_repres` (line 245), right?

Do I understand this correctly? I am trying to figure out the difference. Can anyone help?
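
For concreteness, here are the two alternatives side by side (a sketch reusing the file's variable names; the shape assumption of `[batch, passage_len, dim]` is mine):

```python
# What line 317 currently does: concatenate the matching output with the
# filter-layer output (line 181), which has not passed through the Bi-LSTM.
encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)

# What Equation (3) seems to describe: concatenate with the Bi-LSTM output
# computed at line 245 instead.
encode_hiddens = tf.concat([aggregation_representation, cur_in_passage_repres], 2)
```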

pjlintw avatar May 26 '19 07:05 pjlintw

Hi @pjlintw

Section 3.2 clearly states that the attention memory in our proposed model has BOTH the Bi-LSTM encoding states AND the multi-perspective matching states. Please take a look at Equations 3 and 4.
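
In code terms, that amounts to something like this sketch (the variable names here are illustrative, not the repo's):

```python
# Attention memory per Equations 3 and 4: concatenate the Bi-LSTM encoding
# states with the multi-perspective matching states along the feature axis.
attention_memory = tf.concat([bilstm_states, matching_states], axis=2)
```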

freesunshine0316 avatar May 27 '19 21:05 freesunshine0316

From the paper:

> The decoder is identical to the one described in Section 2.2, **except** that matching information is added to the attention memory.

freesunshine0316 avatar May 27 '19 22:05 freesunshine0316

@freesunshine0316 Thank you for pointing that out!

My question is: why is the matching encoder output concatenated with `in_passage_repres` instead of `cur_in_passage_repres`?

Here:

```python
encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)
```

Equation (3) states that the passage states come from the bi-directional LSTM. When I read the code, I expected this to be `cur_in_passage_repres` from line 245, since that is computed by the Bi-LSTM, whereas `in_passage_repres` is not.

pjlintw avatar May 28 '19 03:05 pjlintw

It's been a long time... It looks like we should use `cur_in_passage_repres`, which may further improve performance.
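
A minimal sketch of the change implied here (same variable names as in `matching_encoder_utils.py`; untested):

```python
# Use the Bi-LSTM output from line 245 rather than the filter-layer output.
encode_hiddens = tf.concat([aggregation_representation, cur_in_passage_repres], 2)
```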

freesunshine0316 avatar May 29 '19 01:05 freesunshine0316