Transformer-backbone
This repository is a reproduction of the Transformer architecture from the paper "Attention Is All You Need".
The aim of this repository is to give readers insight into the details of a Transformer implementation without being bogged down by data preprocessing.
The structure of the Transformer is illustrated below.

We therefore build the network hierarchically. From the top level down, the hierarchy is:
Transformer -> Fused_Embedding / Encoder / Decoder -> Encoder_layer / Decoder_layer -> Multi-Headed Attention / Position-Wise Feed-Forward Network
The corresponding file/module tree is shown below (a minimal code sketch of these building blocks follows the tree):
- Transformer.py
  - Fus_Embeddings (AggregationModel.py)
    - Word Embedding Vectors
    - Positional Encoding (Modules.py)
  - Encoder (AggregationModel.py)
    - Encoder Layer (Model.py)
      - MultiHeadedAttention (Modules.py)
      - PostionWiseFFN (Modules.py)
  - Decoder (AggregationModel.py)
    - Decoder Layer (Model.py)
      - MultiHeadedAttention (Modules.py)
      - PostionWiseFFN (Modules.py)
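To make the hierarchy concrete, here is a minimal, self-contained sketch of how the bottom two levels of the tree typically compose in PyTorch. The class names mirror the tree above, but the constructor signatures and implementation details are illustrative assumptions, not the actual code in Modules.py or Model.py; the decoder layer is analogous, with an additional masked self-attention and an encoder-decoder attention sub-layer.

```python
import math
import torch
import torch.nn as nn

class MultiHeadedAttention(nn.Module):
    """Scaled dot-product attention run in parallel over several heads."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_k, self.n_heads = d_model // n_heads, n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, q, k, v, mask=None):
        b = q.size(0)
        # Project and split into heads: (batch, heads, seq_len, d_k)
        q, k, v = [
            w(x).view(b, -1, self.n_heads, self.d_k).transpose(1, 2)
            for w, x in ((self.w_q, q), (self.w_k, k), (self.w_v, v))
        ]
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        out = torch.softmax(scores, dim=-1) @ v
        # Merge heads back to (batch, seq_len, d_model)
        out = out.transpose(1, 2).contiguous().view(b, -1, self.n_heads * self.d_k)
        return self.w_o(out)

class PostionWiseFFN(nn.Module):
    """Two linear layers applied identically at every position."""
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))

    def forward(self, x):
        return self.net(x)

class EncoderLayer(nn.Module):
    """Self-attention + feed-forward, each wrapped in a residual connection and LayerNorm."""
    def __init__(self, d_model, n_heads, d_ff):
        super().__init__()
        self.attn = MultiHeadedAttention(d_model, n_heads)
        self.ffn = PostionWiseFFN(d_model, d_ff)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, mask=None):
        x = self.norm1(x + self.attn(x, x, x, mask))  # residual + norm around attention
        return self.norm2(x + self.ffn(x))            # residual + norm around FFN

# Quick shape check: batch of 2 sentences, 10 tokens each, d_model = 512
x = torch.randn(2, 10, 512)
print(EncoderLayer(d_model=512, n_heads=8, d_ff=2048)(x).shape)  # torch.Size([2, 10, 512])
```

The Encoder and Decoder in AggregationModel.py then simply stack N such layers, and Fus_Embeddings adds sinusoidal positional encodings onto the word embedding vectors before the stack.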
Environment Configuration
- PyTorch 1.1.0
- Python 3.6.8
- torchtext 0.5.0
- tqdm
- dill
Usage
WMT'17 Multimodal Translation: de-en BPE
- The byte-pair encoding has already been applied, so you can focus on the structure of the Transformer itself.
- Train the model:
  python train.py -data_pkl ./bpe_deen/bpe_vocab.pkl -train_path ./bpe_deen/deen-train -val_path ./bpe_deen/deen-val -log deen_bpe -label_smoothing -save_model trained -b 256 -warmup 128000 -epoch 400
- GPU requirement: 4 TitanX
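The -warmup flag presumably sets the number of warmup steps for the learning-rate schedule described in the paper, lr = d_model^-0.5 * min(step^-0.5, step * warmup^-1.5). Below is a minimal sketch of that schedule; the function name and the way it is attached to the optimizer are illustrative assumptions, and train.py may implement it differently.

```python
import torch

def noam_lr(step, d_model=512, warmup=128000):
    """Learning-rate schedule from "Attention Is All You Need":
    linear warmup for `warmup` steps, then inverse-square-root decay."""
    step = max(step, 1)  # avoid 0 ** -0.5 on the very first call
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

# Example: attach the schedule to a plain Adam optimizer (base lr = 1.0 so the
# lambda's value becomes the effective learning rate).
model = torch.nn.Linear(512, 512)  # stand-in for the full Transformer
optimizer = torch.optim.Adam(model.parameters(), lr=1.0, betas=(0.9, 0.98), eps=1e-9)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lr)
```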
Performance
(Figures: training loss and accuracy curves)
Acknowledgement
- The data interface is borrowed from "A PyTorch implementation of the Transformer model in 'Attention is All You Need'".
- Another outstanding work, "The Annotated Transformer", inspired my coding process.