Guo Sheng

Results: 19 issues by Guo Sheng

Add ParallelExecutor for training, validation and saving in Transformer.

Tune the Transformer model for wmt14

Make the Transformer network configurations more flexible

Refine the Transformer network to be more flexible and with higher BLEU

Add fluid_transformer.md

Fix text decoding in Transformer under python3.

Rename neural_machine_translation as PaddleMT.

Update Transformer details

Add validation for dygraph Transformer. Add cross-attention cache for dygraph Transformer. Add greedy search for dygraph Transformer. In addition, to be consistent with T2T, please also refer to #3684 and update the details accordingly, mainly the position encoding and the scale in scaled_dot_product_attention.