
Convolutional Sequence to Sequence Learning

Open flrngel opened this issue 6 years ago • 0 comments

Convolutional Sequence to Sequence Learning

aka Fairseq

https://arxiv.org/pdf/1705.03122.pdf

3. A Convolutional Architecture

3.1. Position Embeddings

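p = (p_1, ..., p_m): position embeddings, one vector per absolute position

e = (w_1 + p_1, ..., w_m + p_m): input element representations, the sum of the word embeddings w and the position embeddings p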
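As a rough illustration of the sum of word and position embeddings, here is a minimal NumPy sketch. The function name `embed_with_positions`, the table sizes, and the random initialization are assumptions made for this example only; in the paper both tables are learned.

```python
import numpy as np

def embed_with_positions(token_ids, word_table, pos_table):
    """Return e_j = w_j + p_j for every position j of the input sequence."""
    token_ids = np.asarray(token_ids)
    positions = np.arange(len(token_ids))   # absolute positions 0..m-1
    w = word_table[token_ids]               # (m, f) word embeddings
    p = pos_table[positions]                # (m, f) position embeddings
    return w + p

rng = np.random.default_rng(0)
vocab_size, max_len, dim = 10, 8, 4         # illustrative sizes (assumed)
word_table = rng.normal(size=(vocab_size, dim))
pos_table = rng.normal(size=(max_len, dim))

# Token 3 appears twice; its two representations differ because the positions differ.
e = embed_with_positions([3, 1, 3], word_table, pos_table)
print(e.shape)
```

Since a convolution sees the same token identically wherever it occurs, adding p_j is what lets the model tell which part of the sequence it is looking at.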

See also

"Positional encoding" from "Attention Is All You Need"

3.2. Convolutional Block Structure

[Image: convolutional block structure] (image from https://norman3.github.io/papers/docs/fairseq.html)

In the image above, the kernel width is 3 and only one convolutional block is stacked
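A single block with kernel width 3 can be sketched as below: a 1D convolution producing 2d channels, a gated linear unit (GLU) that halves them back to d, and a residual connection to the block input. This is a simplified encoder-side sketch (symmetric zero padding, explicit loop, no weight normalization or dropout); the names `glu` and `conv_block` and all sizes are assumptions for illustration.

```python
import numpy as np

def glu(y):
    """Gated linear unit: split channels into A and B, return A * sigmoid(B)."""
    a, b = np.split(y, 2, axis=-1)
    return a * (1.0 / (1.0 + np.exp(-b)))

def conv_block(x, W, b):
    """One convolutional block.

    x: (m, d) input sequence, W: (k, d, 2d) kernel, b: (2d,) bias.
    Returns an (m, d) output so that blocks can be stacked.
    """
    k = W.shape[0]
    pad = k // 2                                   # keep output length equal to m (odd k)
    x_pad = np.pad(x, ((pad, pad), (0, 0)))
    m = x.shape[0]
    # Each output position sees a window of k input elements.
    y = np.stack([
        np.einsum("kd,kde->e", x_pad[i:i + k], W) + b
        for i in range(m)
    ])
    return glu(y) + x                              # residual connection

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))                        # m=5 elements, d=4 channels (assumed)
W = rng.normal(size=(3, 4, 8)) * 0.1               # kernel width k=3
b = np.zeros(8)
h = conv_block(x, W, b)
print(h.shape)
```

Stacking more blocks widens the receptive field: each extra block with kernel width k lets an output depend on k - 1 more input elements.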

3.3. Multi-step Attention

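The decoder state summary d_i combines the current decoder state h_i with the embedding g_i of the previous target element, added in as a residual-style connection: d_i = W_d h_i + b_d + g_i

The attention weights a_ij are the softmax over j of the dot product between d_i and each encoder output z_j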
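One attention step for a single decoder layer might look like the following NumPy sketch. It follows the paper's formulation as I read it: d_i = W_d h_i + b_d + g_i, weights from a softmax over d_i · z_j, and a conditional input c_i that sums the weighted z_j + e_j. The function name `multi_step_attention` and all shapes are assumptions for illustration.

```python
import numpy as np

def multi_step_attention(h, g, z, e, W_d, b_d):
    """Attention for one decoder layer.

    h: (n, d) decoder states, g: (n, d) previous target element embeddings,
    z: (m, d) encoder outputs, e: (m, d) encoder input embeddings.
    Returns the conditional input c (n, d) and the weights a (n, m).
    """
    d = h @ W_d + b_d + g                          # add g_i to the projected state
    scores = d @ z.T                               # dot product of d_i with every z_j
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    a = np.exp(scores)
    a /= a.sum(axis=1, keepdims=True)              # softmax over source positions j
    c = a @ (z + e)                                # weighted sum of z_j + e_j
    return c, a

rng = np.random.default_rng(0)
n, m, dim = 2, 3, 4                                # illustrative sizes (assumed)
h, g = rng.normal(size=(n, dim)), rng.normal(size=(n, dim))
z, e = rng.normal(size=(m, dim)), rng.normal(size=(m, dim))
W_d, b_d = rng.normal(size=(dim, dim)), np.zeros(dim)

c, a = multi_step_attention(h, g, z, e, W_d, b_d)
print(a.sum(axis=1))                               # each row of weights sums to 1
```

"Multi-step" refers to this being computed separately in every decoder layer, with c_i added to that layer's output h_i before it feeds the next layer.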

3.4. Normalization Strategy

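Parts of the network are scaled so that the variance of activations stays roughly constant, which helps stabilize learning. For example, the sum of the input and output of a residual block is multiplied by sqrt(0.5)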
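The effect of the sqrt(0.5) scaling on a residual sum can be checked numerically with a small sketch. It assumes the two summands are independent with equal variance, which is the simplifying assumption behind the scaling; the helper name `scaled_residual` is hypothetical.

```python
import numpy as np

def scaled_residual(x, y):
    """Sum a block's input and output, then scale by sqrt(0.5).

    If x and y are independent with equal variance, the plain sum has twice
    that variance; the scaling brings it back to the original level.
    """
    return (x + y) * np.sqrt(0.5)

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)       # stand-in for the block input
y = rng.normal(size=100_000)       # stand-in for the block output
out = scaled_residual(x, y)

print("variance of plain sum: ", (x + y).var())
print("variance of scaled sum:", out.var())
```

Without the scaling, the variance would keep growing with every stacked block, which is one reason deep stacks can be hard to train early on.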

flrngel, Jan 29 '18 02:01