show_attend_and_tell.tensorflow
A question about the parameter update
In the following code, it seems that the updated cell state `c` is never actually used:

```python
lstm_preactive = tf.matmul(h, self.lstm_U) + x_t + tf.matmul(weighted_context, self.image_encode_W)
i, f, o, new_c = tf.split(1, 4, lstm_preactive)
i = tf.nn.sigmoid(i)
f = tf.nn.sigmoid(f)
o = tf.nn.sigmoid(o)
new_c = tf.nn.tanh(new_c)
c = f * c + i * new_c
h = o * tf.nn.tanh(new_c)
```
Why does `h` depend on `new_c` rather than `c`? In my opinion, the update should be:

`c(t) = f(t) * c(t-1) + i(t) * new_c(t)`
`h(t) = o(t) * tanh(c(t))`
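For reference, here is a minimal sketch of one LSTM step with the conventional cell update (plain NumPy; the function and variable names are illustrative, not the repo's actual code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, U, W_ctx, context):
    # Shapes assume batch size B and hidden size H, as in the snippet above.
    preactive = h_prev @ U + x_t + context @ W_ctx  # (B, 4H) preactivation
    i, f, o, g = np.split(preactive, 4, axis=1)     # g plays the role of new_c
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)                                  # candidate cell state
    c = f * c_prev + i * g                          # c(t) = f*c(t-1) + i*g(t)
    h = o * np.tanh(c)                              # h(t) = o*tanh(c(t)), not tanh(g)
    return h, c
```

With this version, `h` is driven by the accumulated cell state `c`, so information can persist across time steps instead of depending only on the current candidate `g`.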
Yeah, I also think that the update should be `h = o * tanh(c)` instead of `h = o * tanh(new_c)`.
Hello, I don't quite understand the meaning of `x_t`. Could you give me some hints? Thank you!
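Not certain about this particular repo, but in the Show, Attend and Tell paper the LSTM input at step t is E y(t-1), the embedding of the previously generated word; `x_t` here presumably plays that role, projected so it can be added directly into `lstm_preactive`. A rough sketch under that assumption (all names and sizes below are made up for illustration):

```python
import numpy as np

B, V, E, H = 2, 1000, 256, 512        # batch, vocab, embedding, hidden (illustrative)
Wemb = np.random.randn(V, E) * 0.01   # word embedding table
lstm_W = np.random.randn(E, 4 * H) * 0.01
prev_word = np.array([4, 7])          # ids of the previously generated words, shape (B,)
word_emb = Wemb[prev_word]            # (B, E) embedding lookup
x_t = word_emb @ lstm_W               # (B, 4H), added into the LSTM preactivation
```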
Yes, the author of this package got it wrong; @automan000, you are right!
I used the author's original model (without changing it to `h(t) = o(t) * tanh(c(t))`). After 12 epochs the loss had only come down to 2.96379992. Is that expected? The loss is so large that the generated output is just a single repeated word, which cannot form a sentence.

@Wind-Ward, could I ask how many epochs you trained for to get a satisfactory result after fixing the mistake you pointed out? Or can a satisfactory model be trained without fixing it? I would appreciate a reply. I am a student from China and not good at English; sorry if I don't express myself well.