rnn-tutorial-rnnlm

A simple question about the output o_t of the RNN

Open UniqueAndys opened this issue 9 years ago • 3 comments

@dennybritz I have been reading your RNN code and I have a question about this function in rnn_theano.py (line 30):

```python
def forward_prop_step(x_t, s_t_prev, U, V, W):
    s_t = T.tanh(U[:, x_t] + W.dot(s_t_prev))
    o_t = T.nnet.softmax(V.dot(s_t))
    return [o_t[0], s_t]
```

It defines a simple recurrent network, and my question is: what are the components of o_t, and why use o_t[0]? When this function is called, x_t is one element of x, where x is a specific example from X_train, so x is a list of integers corresponding to word indices and x_t is a single integer. In that case o_t should be a 1-d vector whose length is word_dim. Could you clear this up for me? Thank you very much!

UniqueAndys avatar Mar 02 '16 13:03 UniqueAndys

It has been a while since I wrote this code, but I think it's just a result of how Theano applies the softmax. I believe Theano always returns a matrix, not a vector, so we are just converting the one-row matrix into a vector (which has length word_dim, as you said).
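A quick sketch that checks this shape behavior (a minimal example, assuming a standard Theano installation; the values and variable names are just for illustration):

```python
import numpy as np
import theano
import theano.tensor as T

v = T.vector('v')                # stands in for V.dot(s_t), shape (word_dim,)
sm = T.nnet.softmax(v)           # Theano's softmax always yields a matrix
f = theano.function([v], [sm, sm[0]])

mat, row = f(np.asarray([1.0, 2.0, 3.0], dtype=theano.config.floatX))
print(mat.shape)  # (1, 3) -- a matrix with a single row, even for vector input
print(row.shape)  # (3,)   -- indexing with [0] recovers the 1-d vector
```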

dennybritz avatar Mar 02 '16 15:03 dennybritz

Thank you very much, I'll check it.

UniqueAndys avatar Mar 03 '16 02:03 UniqueAndys

@dennybritz Hello, I also have a question here:

```python
def forward_prop_step(x_t, s_t_prev, U, V, W):
    s_t = T.tanh(U[:, x_t] + W.dot(s_t_prev))
    o_t = T.nnet.softmax(V.dot(s_t))
    return [o_t[0], s_t]
```

As defined in your blog post, s_t has shape 100x1 and V is 8000x100, so V.dot(s_t) is 8000x1 and o_t should be 8000x1. Why do we use o_t[0]? I'm confused about this; could you clarify it for me? Thank you very much!

yyHaker avatar Nov 08 '17 04:11 yyHaker
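For later readers: a numpy-only sketch of the shapes involved, assuming word_dim = 8000 and hidden_dim = 100 as in the blog post. In the code, s_t is a length-100 vector rather than a 100x1 matrix, so V.dot(s_t) is a length-8000 vector; per the earlier reply, it is Theano's softmax that promotes it to a 1x8000 matrix, which o_t[0] undoes:

```python
import numpy as np

word_dim, hidden_dim = 8000, 100          # sizes from the blog post
U = np.zeros((hidden_dim, word_dim))      # input-to-hidden weights
V = np.zeros((word_dim, hidden_dim))      # hidden-to-output weights
W = np.zeros((hidden_dim, hidden_dim))    # hidden-to-hidden weights

x_t = 42                                  # a single word index (illustrative)
s_t_prev = np.zeros(hidden_dim)           # previous hidden state, shape (100,)

s_t = np.tanh(U[:, x_t] + W.dot(s_t_prev))
print(s_t.shape)         # (100,)  -- a 1-d vector, not a 100x1 matrix
print(V.dot(s_t).shape)  # (8000,) -- also 1-d; Theano's softmax then returns
                         # a (1, 8000) matrix, and o_t[0] recovers the vector
```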