
Extracting hidden state of seq2seq model?

jasonzliang opened this issue 7 years ago · 3 comments

Given a trained Seq2Seq model (with one encoder and one decoder layer) and an input sequence, how can I get the hidden state/vector of the encoder immediately after the input sequence has been fed through it?

In particular, I am trying to reproduce this paper: https://openreview.net/pdf?id=BkSqjHqxg, where the authors train a seq2seq model and then use its hidden state as an embedding for the input sequence.

Thanks!

jasonzliang · Jul 11 '17

Please refer to https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer
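For example, with a backend function as described in that FAQ (a minimal sketch, assuming the trained model is stored in a variable `model`, that the layer of interest has the hypothetical name `'encoder'`, and that `input_sequence` is a numpy array matching the model's input shape):

```python
from keras import backend as K

# Backend function mapping the model's input to the output of an
# intermediate layer ('encoder' is a hypothetical name; if the model
# has several inputs, pass them all in the first list).
get_hidden = K.function([model.input],
                        [model.get_layer('encoder').output])

# Call it on a numpy array shaped like the model's input.
hidden_state = get_hidden([input_sequence])[0]
```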

inzaghi250 · Jul 14 '17

Here is a summary of what my model looks like:

> ___________________________________________________________________________________________________
> Layer (type)                     Output Shape          Param #     Connected to                     
> ====================================================================================================
> input_38 (InputLayer)            (1, 5, 5)             0                                            
> ____________________________________________________________________________________________________
> time_distributed_2 (TimeDistribu (1, 5, 128)           768         input_38[0][0]                   
> ____________________________________________________________________________________________________
> private__optional_input_place_ho (2,)                  0                                            
> ____________________________________________________________________________________________________
> private__optional_input_place_ho (2,)                  0                                            
> ____________________________________________________________________________________________________
> private__optional_input_place_ho (2,)                  0                                            
> ____________________________________________________________________________________________________
> recurrent_sequential_3 (Recurren [(1, 128), (1, 128),  131584      time_distributed_2[0][0]         
>                                                                    private__optional_input_place_hol
>                                                                    private__optional_input_place_hol
>                                                                    private__optional_input_place_hol
> ____________________________________________________________________________________________________
> dense_21 (Dense)                 (1, 5)                645         recurrent_sequential_3[0][0]     
> ____________________________________________________________________________________________________
> recurrent_sequential_4 (Recurren (1, 5, 5)             69253       dense_21[0][0]                   
>                                                                    recurrent_sequential_3[0][1]     
>                                                                    recurrent_sequential_3[0][2]     
>                                                                    dense_21[0][0]                   
> ====================================================================================================
> Total params: 202,250
> Trainable params: 202,250
> Non-trainable params: 0
> ____________________________________________________________________________________________________

Which layer would be the encoder layer?

jasonzliang · Jul 14 '17

dense_21, in this case. You can give the encoding layer a custom name to make it easier to find.
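For example, following the intermediate-layer approach from the Keras FAQ above (a sketch, assuming the summary comes from a model object `model` and `x` is an input batch of shape (1, 5, 5); Keras 2-style keyword arguments):

```python
from keras.models import Model

# Sub-model that stops at the encoding layer (dense_21 in the summary above).
encoder = Model(inputs=model.input,
                outputs=model.get_layer('dense_21').output)

# The (1, 5) encoding for the input batch x.
embedding = encoder.predict(x)
```

When building the model yourself, passing e.g. `name='encoding'` to the layer lets you retrieve it later with `model.get_layer('encoding')` instead of hunting for an auto-generated name like `dense_21`.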

uniaz · Jul 21 '17