
dcn_decode function

Open mehdimashayekhi opened this issue 6 years ago • 0 comments

Hi, quick question: don't we need to feed 'state' into decoder_body instead of 'output'? See this line: https://github.com/andrejonasson/dynamic-coattention-network-plus/blob/5182d91b2ff3707f9cafb308bf81f8bdd8bf5843/question_answering/networks/dcn_plus.py#L402

update: I ran a toy experiment, and output and the 'h' part of state are the same here. (I guess they are the same in all cases except when the input has zero values. I am used to feeding the final state, as in a seq2seq model for example, except when outputting a label for each word, in which case we use output.)

import numpy as np
import tensorflow as tf  # TensorFlow 1.x (tf.contrib is required)

tf.reset_default_graph()
state_size = 6
lstm_dec = tf.contrib.rnn.LSTMCell(num_units=state_size)
state = lstm_dec.zero_state(2, dtype=tf.float32)
encoding = tf.placeholder(dtype=tf.float32, shape=[None, 6])
# A single cell step returns (output, new_state)
output, state = lstm_dec(encoding, state)
X_batch = np.array(
    [[0, 1, 2, 9, 8, 7],
     [3, 4, 5, 0, 1, 3]])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs_val, states_val = sess.run([output, state], 
                                     feed_dict={encoding: X_batch})
    print('outputs:')
    print(sess.run(tf.shape(outputs_val)))
    print(outputs_val)
    print('\nstates:')
#     print(sess.run(tf.shape(states_val)))
    print(states_val)

outputs:
[2 6]
[[-0.00059593 0.00258833 -0.21033162 -0.00067049 -0.14312631 -0.31653395]
 [ 0.29717636 0.10479536 -0.04902191 0.02340557 0.0264852 0.41031399]]

states:
LSTMStateTuple(c=array([[ -8.47961660e-03, 4.81678136e-02, -7.84035921e-01, -6.87910477e-04, -5.49614251e-01, -9.96938109e-01],
       [ 4.05349284e-01, 1.13499925e-01, -4.16956961e-01, 4.35479954e-02, 6.23860434e-02, 4.93930310e-01]], dtype=float32), h=array([[-0.00059593, 0.00258833, -0.21033162, -0.00067049, -0.14312631, -0.31653395],
       [ 0.29717636, 0.10479536, -0.04902191, 0.02340557, 0.0264852 , 0.41031399]], dtype=float32))

Note that 'h' and outputs are the same above. Also note that in decoder_body the shape of 'state' is [N, H], not [N, D, C]: https://github.com/andrejonasson/dynamic-coattention-network-plus/blob/5182d91b2ff3707f9cafb308bf81f8bdd8bf5843/question_answering/networks/dcn_plus.py#L449
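
To illustrate why 'h' and the output match, here is a minimal NumPy sketch of a single LSTM step. It assumes the textbook LSTM equations, not TensorFlow's actual LSTMCell code (no forget bias, peepholes or projection). The names lstm_step, W and b are hypothetical and the weights are random, so the numbers will not match the run above; only the relationship between output and state is meant to carry over.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, c_prev, h_prev, W, b):
    """One step of a textbook LSTM cell (illustrative, not TensorFlow's code)."""
    # Compute all four gate pre-activations from [x, h_prev] in one matmul.
    z = np.concatenate([x, h_prev], axis=1) @ W + b
    i, j, f, o = np.split(z, 4, axis=1)
    # New cell state: keep part of the old one, add gated candidate values.
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(j)
    # New hidden state: gated view of the cell state.
    h = sigmoid(o) * np.tanh(c)
    # h is returned twice: once as the output, once inside the state tuple.
    return h, (c, h)


rng = np.random.default_rng(0)
batch, dim = 2, 6
W = rng.normal(scale=0.1, size=(2 * dim, 4 * dim))  # random, hypothetical weights
b = np.zeros(4 * dim)

x = np.array([[0, 1, 2, 9, 8, 7],
              [3, 4, 5, 0, 1, 3]], dtype=np.float32)
c0 = np.zeros((batch, dim))
h0 = np.zeros((batch, dim))

output, (c, h) = lstm_step(x, c0, h0, W, b)
print(np.array_equal(output, h))  # output and state.h are the same array
print(np.array_equal(output, c))  # the cell state c is a different quantity
```

Under these assumptions, passing 'output' or the 'h' part of 'state' to the next layer gives the same values after a single step; only the 'c' part carries extra information.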

mehdimashayekhi Aug 04 '18 23:08