
How to extract the "sequence vector"

tsaiian opened this issue 8 years ago · 4 comments

Hi all,

Following "Semi-supervised Sequence Learning", I want to extract the "sequence vector" (i.e. the output of the LSTM encoder) of an autoencoder, but I don't understand the shape of the encoder output:

import numpy as np
from keras.layers import Embedding, Input
from keras.models import Model
from seq2seq.models import Seq2Seq

_maxlen = 10
_nb_words = 2000
_word_emb_dim = 50

embedder = Embedding(input_dim=_nb_words, output_dim=_word_emb_dim)
seq2seq = Seq2Seq(
    input_shape=(_maxlen, _word_emb_dim),
    output_length=_maxlen,
    output_dim=_word_emb_dim,
    depth=4)

# encoder
inputs = Input(shape=(_maxlen, ))
embed = embedder(inputs)
seq_vec = seq2seq.encoder(embed)

encoder = Model(inputs, seq_vec)

## print the shape of the output:
print(encoder.output_shape)
# output: [(None, 50), None, None]

result = encoder.predict(np.array([[0,0,0,0,0,1,2,3,4,5],[0,0,0,0,0,5,4,3,2,1]]))
print(np.array(result).shape)
# output: (3, 2, 50)

My question is: why is the output shape (3, number of instances, _word_emb_dim)? The LSTM depth is 4, so shouldn't it generate 5 vectors (the red line in the figure below)? [figure: seq2seq]
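For what it's worth, the (3, 2, 50) shape can be reproduced in plain NumPy under the assumption (borne out later in this thread) that the encoder returns a Python list of three (batch, dim) arrays; wrapping such a list in np.array() stacks it along a new leading axis of length 3:

```python
import numpy as np

# Hypothetical stand-in for what encoder.predict() appears to return:
# a list of three (batch_size, word_emb_dim) arrays.
batch_size, word_emb_dim = 2, 50
result = [np.zeros((batch_size, word_emb_dim)) for _ in range(3)]

# np.array() stacks the list along a new leading axis,
# which reproduces the (3, 2, 50) shape reported above.
print(np.array(result).shape)  # (3, 2, 50)
```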

I will be grateful for any help you can provide.

tsaiian (Dec 16 '16)

I'd like to "bump" this question. I actually don't think the len=3 has anything to do with the depth because I have a seq2seq with depth=1 which also returns an output of len=3.

Anyone with any insight into what the 3 items are in the output?

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_135 (InputLayer)           (None, 5)             0                                            
____________________________________________________________________________________________________
embedding_layer (Embedding)      (None, 5, 300)        3600        input_135[0][0]                  
____________________________________________________________________________________________________
private__optional_input_place_ho (2,)                  0                                            
____________________________________________________________________________________________________
private__optional_input_place_ho (2,)                  0                                            
____________________________________________________________________________________________________
private__optional_input_place_ho (2,)                  0                                            
____________________________________________________________________________________________________
recurrent_sequential_7 (Recurren [(None, 300), (None,  721200      embedding_layer[1][0]            
                                                                   private__optional_input_place_hol
                                                                   private__optional_input_place_hol
                                                                   private__optional_input_place_hol
====================================================================================================
Total params: 724,800
Trainable params: 721,200
Non-trainable params: 3,600

michaelcapizzi (Jun 30 '17)

I have successfully compiled a model that gets around the issue of a multiple-length output described above.

I will show it below, but a caveat: I am not confident that the step I took is the correct one! I'll explain my doubts below as well.

The seq2seq.encoder from the code sample by @tsaiian has an output of length 3. When I inspect those 3 outputs, I see this:

[<tf.Tensor 'recurrent_sequential_1_1/TensorArrayReadV3:0' shape=(?, 300) dtype=float32>, <tf.Tensor 'recurrent_sequential_1_1/while/Exit_2:0' shape=(?, 300) dtype=float32>, <tf.Tensor 'recurrent_sequential_1_1/while/Exit_3:0' shape=(?, 300) dtype=float32>]
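As a side note, the two while/Exit tensors look like they could be an LSTM's final hidden and cell states (a guess based on the tensor names, not something confirmed by the library). A toy NumPy LSTM step, written from the standard LSTM equations, shows why an LSTM carries two (batch, dim) state arrays in addition to its per-step output:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; the (i, f, o, g) gate layout is an illustrative choice."""
    z = x @ W + h @ U + b
    dim = h.shape[-1]
    i = sigmoid(z[:, :dim])           # input gate
    f = sigmoid(z[:, dim:2 * dim])    # forget gate
    o = sigmoid(z[:, 2 * dim:3 * dim])  # output gate
    g = np.tanh(z[:, 3 * dim:])       # candidate cell update
    c_new = f * c + i * g             # cell state
    h_new = o * np.tanh(c_new)        # hidden state (also the step's output)
    return h_new, c_new

rng = np.random.default_rng(0)
batch, in_dim, dim, steps = 2, 50, 50, 10
W = rng.normal(size=(in_dim, 4 * dim)) * 0.1
U = rng.normal(size=(dim, 4 * dim)) * 0.1
b = np.zeros(4 * dim)

h = np.zeros((batch, dim))
c = np.zeros((batch, dim))
for t in range(steps):
    x_t = rng.normal(size=(batch, in_dim))
    h, c = lstm_step(x_t, h, c, W, U, b)

# After the loop there are three (batch, dim) arrays in play:
# the last per-step output (which equals h), the hidden state h,
# and the cell state c -- one plausible reading of the length-3 output.
print(h.shape, c.shape)  # (2, 50) (2, 50)
```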

So it appears as if the output is three different "versions" of the same RNN output. My solution was simply to use the first output only. So the encoder by @tsaiian now becomes:

# encoder
encoder = Model(inputs, seq_vec[0])   # only use the *first* output of the encoder

Note: There is also a Dense layer between the encoder and decoder in seq2seq, and I tried adding that to MY encoder, but in a very simple experiment it seemed to wash out the distinctions between sentences. Two very different sentences still had a cosine similarity in the upper .80s, whereas without it, the same two sentences had a similarity of only around .50.
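The comparison described in the note above can be sketched with plain NumPy (the vectors here are random stand-ins for real sequence vectors, e.g. rows of encoder.predict(...)):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative 50-dimensional "sequence vectors" for two sentences.
rng = np.random.default_rng(0)
vec_a = rng.normal(size=50)
vec_b = rng.normal(size=50)

print(cosine_similarity(vec_a, vec_a))  # ~1.0 for identical vectors
print(cosine_similarity(vec_a, vec_b))  # typically near 0 for unrelated random vectors
```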

So I'm hoping this information might get all of us closer to building a true standalone encoder. And perhaps @farizrahman4u has some insight into why the encoder portion of seq2seq has an output of length 3.

michaelcapizzi (Jul 10 '17)

I know this is an old post and this might be a silly question, but how did you go about compiling the standalone encoder?

Thanks in advance.

Kind regards, Theodore.

TheodoreGalanos (May 19 '18)

I'm not sure I understand your question, but in my last comment I showed what I think is the encoder-only portion of the model, based on the original poster's code:

# encoder
encoder = Model(inputs, seq_vec[0])   # only use the *first* output of the encoder

michaelcapizzi (May 31 '18)