
Directional Self-Attention Network for RNN/CNN-Free Language Understanding


https://arxiv.org/abs/1709.04696

This is a primitive version of Bi-BloSAN (see #2 for more).

1. Introduction

  • Multi-Head Attention: an attention layer that benefits from parallel computing (see the sketch after this list)
  • The paper points out that earlier attention mechanisms in NLP were designed for Seq2Seq models
  • Positional encoding is what supplies temporal-order information to attention
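
A minimal NumPy sketch of the parallel-heads idea, for illustration only: the learned per-head projections are skipped, and each head simply attends over its own slice of the feature dimension.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads=4):
    """Scaled dot-product attention computed for all heads at once.

    x: (seq_len, d_model); d_model must be divisible by num_heads.
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    # (num_heads, seq_len, d_head): every head is an independent slice,
    # so the matmuls below run in parallel over the head axis.
    h = x.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    scores = h @ h.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    out = weights @ h                                      # (heads, seq, d_head)
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)

x = np.random.randn(5, 8)
print(multi_head_attention(x).shape)  # (5, 8)
```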

Features of DiSAN

  • Context aware
  • Parallel computing
  • Directional Self-Attention
  • Fewer parameters than RNNs, the Transformer, etc.

w.r.t. means "with respect to"

3. Background

Self-Attention: the same sequence x supplies both sides of the alignment score, i.e. f(x_i, x_j) is computed between elements of one sequence.
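
A minimal dot-product sketch of this in plain NumPy; the paper's actual score is the additive, multi-dimensional form in 3.1, but the "same x on both sides" structure is identical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Token2token self-attention: the score f(x_i, x_j) is computed
    between elements of the SAME sequence x (x plays both the i and j roles)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])   # (n, n), one scalar score per (i, j)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ x                        # context vector for every position i

x = np.random.randn(6, 4)       # 6 tokens, 4 features
print(self_attention(x).shape)  # (6, 4)
```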

3.1 Multi-dimensional Attention

  • The multi-dimensional (feature-wise) attention score makes the model context-aware
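
A rough NumPy sketch of the feature-wise score f(x_i, q) = W^T σ(W^(1) x_i + W^(2) q + b^(1)) + b; the random weights, the ELU choice, and the shapes below are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_dim_attention(x, q):
    """Multi-dimensional attention: the score for each token is a d-dim
    vector, so every feature gets its own attention distribution over
    tokens; that per-dimension weighting is what makes it context-aware."""
    n, d = x.shape
    W1, W2, W = rng.standard_normal((3, d, d)) * 0.1
    b1, b = np.zeros(d), np.zeros(d)
    elu = lambda z: np.where(z > 0, z, np.exp(z) - 1)
    scores = elu(x @ W1 + q @ W2 + b1) @ W + b   # (n, d): one score per feature
    weights = softmax(scores, axis=0)            # normalize over tokens, per feature
    return (weights * x).sum(axis=0)             # (d,) attended summary

x = rng.standard_normal((6, 4))
q = rng.standard_normal(4)
print(multi_dim_attention(x, q).shape)  # (4,)
```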

3.3 Directional Self-Attention

  1. x -> h (a fully connected layer applied to the input)
  2. token2token (multi-dimensional token2token self-attention)
  3. process partly
  4. masking

Adding -inf to a position makes its softmax weight zero.
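
A minimal sketch of step 4: adding -inf to disallowed (i, j) pairs before the softmax drives their attention weights to exactly zero, which is how the forward/backward masks impose a direction. The triangular masks below are illustrative; the paper's exact index convention also disables the diagonal (i = j).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

n = 4
scores = np.random.randn(n, n)          # raw alignment scores f(x_i, x_j)

# Directional masks: 0 for allowed pairs, -inf for disallowed ones.
# The diagonal is kept here so every row has at least one allowed entry.
fw_mask = np.where(np.tril(np.ones((n, n))) > 0, 0.0, -np.inf)  # one direction
bw_mask = np.where(np.triu(np.ones((n, n))) > 0, 0.0, -np.inf)  # the other

fw_weights = softmax(scores + fw_mask, axis=-1)
# exp(-inf) == 0, so masked pairs get exactly zero attention weight
print(np.round(fw_weights, 2))          # upper triangle is all zeros, rows sum to 1
```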
