video_to_sequence
Question about n_lstm_steps
Hi! Thank you for your excellent work! I have learned a lot from your implementation. But I have a small question about n_lstm_steps. You give the encoding and decoding stages the same n_lstm_steps, so each stage unrolls for 80 time steps. However, the original paper says that
"... we unroll the LSTM to a fixed 80 time steps during training. ... to ensure that the sum of the number of frames and words is within this limit."
So I am curious whether the difference between these two strategies will change the model's behavior significantly or not (see the sketch below).
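To make the comparison concrete, here is a minimal sketch of the two unrolling budgets. The step counts and variable names are hypothetical, chosen only for illustration; this is not code from the repo:

```python
n_lstm_steps = 80  # fixed unroll length, as in the paper

# This implementation: encoder and decoder each unroll n_lstm_steps times,
# so the computation graph spans 2 * n_lstm_steps = 160 steps in total.
total_steps_impl = 2 * n_lstm_steps

# The paper: a single fixed 80-step unroll shared by frames and words,
# so the number of frames plus the number of words must fit the budget.
n_frames, n_words = 50, 30  # hypothetical split that respects the limit
assert n_frames + n_words <= n_lstm_steps
total_steps_paper = n_lstm_steps

print(total_steps_impl, total_steps_paper)  # 160 80
```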
Besides, I also want to know whether padding the video frames with zeros introduces some error. In my opinion, the zero-padded frames will affect the hidden state of the encoder LSTM, so the encoded information will include some noise. Is that right?
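As a sanity check on this intuition, here is a toy NumPy sketch of a generic recurrent update (not the repo's LSTM; the weights and the `step` function are made up for illustration). It shows that an all-zero input still moves the hidden state through the recurrent weights and the bias:

```python
import numpy as np

# Toy recurrent update h_t = tanh(Wh @ h + Wx @ x + b), illustrative only.
rng = np.random.default_rng(0)
dim = 4
Wh = rng.normal(size=(dim, dim))
Wx = rng.normal(size=(dim, dim))
b = rng.normal(size=dim)

def step(h, x):
    return np.tanh(Wh @ h + Wx @ x + b)

h = step(np.zeros(dim), rng.normal(size=dim))  # state after one real frame
h_padded = step(h, np.zeros(dim))              # feed an all-zero padding frame

# The padding step still changes the state via Wh @ h and b, so the final
# encoding differs from simply stopping at the last real frame.
print(np.allclose(h, h_padded))  # False in general
```

So yes, in a plain unrolled RNN the padded steps do perturb the state; one common remedy is to mask the padded time steps so the previous hidden state is carried through unchanged.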
Thanks!