
Why does HAN's attention use reduce_sum and reduce_max?

Open · fengdoudou1895 opened this issue 4 years ago · 1 comment

In HAN's attention I see the following:

```python
attetion_logits = tf.reduce_sum(hidden_state_context_similarity, axis=2)
attention_logits_max = tf.reduce_max(attention_logits, axis=1, keep_dims=True)
p_attention = tf.nn.softmax(attetion_logits - attention_logits_max)
```

I didn't see this operation in the original paper. Why is it done here?

fengdoudou1895 · Apr 21 '20

This is because softmax is shift-invariant: subtracting the same constant from every coordinate of the input leaves the output unchanged. See Wikipedia and this StackOverflow answer.
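
Spelled out as a one-line derivation (with $c$ standing for the subtracted constant, here the row-wise maximum), the factor $e^{-c}$ cancels between numerator and denominator:

```latex
\operatorname{softmax}(x - c)_i
  = \frac{e^{x_i - c}}{\sum_j e^{x_j - c}}
  = \frac{e^{-c}\, e^{x_i}}{e^{-c} \sum_j e^{x_j}}
  = \operatorname{softmax}(x)_i
```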

Subtracting the maximum value from the attention_logits makes the computation numerically stable: after the shift, every logit is at most 0, so the exponentials stay in [0, 1] and cannot overflow.
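
Here is a minimal NumPy sketch (not the repo's code) showing what goes wrong without the max-subtraction and why the shifted version gives the same probabilities:

```python
import numpy as np

def softmax_naive(logits):
    # Overflows once any logit is large: np.exp(1000.) == inf, and inf/inf == nan.
    e = np.exp(logits)
    return e / e.sum()

def softmax_stable(logits):
    # Subtract the max first; every exponent is then <= 0,
    # so exp() stays in [0, 1] and cannot overflow.
    shifted = logits - logits.max()
    e = np.exp(shifted)
    return e / e.sum()

logits = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(logits))   # [nan nan nan], with an overflow warning
print(softmax_stable(logits))  # [0.09003057 0.24472847 0.66524096]
```

This is exactly the trick the quoted HAN code applies with tf.reduce_max before tf.nn.softmax.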

acadTags · May 20 '20