attention-analysis
Bug in adding dummy word_repr for root
The dummy representation should be inserted at index 0 along the token dimension (axis 1); currently it is appended at the last index. Since the attention approximated for ROOT (by adding the attention to the start/end tokens) is placed at index 0, the word representation for ROOT should also be at index 0.
Fixing this bug gave me around 3% higher UAS.
word_reprs = tf.concat([word_reprs, tf.zeros((n_words, 1, 200))], 1) # dummy for ROOT
This should be replaced with:
word_reprs = tf.concat([tf.zeros((n_words, 1, 200)), word_reprs], 1) # dummy for ROOT
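For illustration, here is a minimal standalone sketch (not taken from the repository; the toy shapes and the variable names `root_dummy`, `buggy`, and `fixed` are assumptions for this example) showing why the order of the concat matters: the attention approximated for ROOT sits at index 0, so the dummy ROOT word representation must also sit at index 0 for the two to stay aligned.

```python
import tensorflow as tf

# Assumed toy shapes: n_words examples, n_tokens words each, 200-dim representations.
n_words, n_tokens, dim = 4, 6, 200
word_reprs = tf.random.normal((n_words, n_tokens, dim))  # per-word representations
root_dummy = tf.zeros((n_words, 1, dim))                 # dummy representation for ROOT

# Buggy version: ROOT ends up at the last position, so it no longer lines
# up with the ROOT attention stored at index 0.
buggy = tf.concat([word_reprs, root_dummy], 1)

# Fixed version: ROOT is prepended, so index 0 of word_reprs matches the
# attention approximated for ROOT from the start/end tokens.
fixed = tf.concat([root_dummy, word_reprs], 1)

print(buggy.shape, fixed.shape)               # both (4, 7, 200)
print(bool(tf.reduce_all(fixed[:, 0] == 0)))  # True: the ROOT slot is the zero vector
print(bool(tf.reduce_all(buggy[:, 0] == 0)))  # False: index 0 holds a real word instead
```

With the fix, index i in the attention map and word_reprs[:, i] refer to the same token, with i = 0 reserved for ROOT.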