
How to train a Transformer model with multi-source encoders?

penny9287 opened this issue · 3 comments

I wonder how to modify the configuration file to train a multi-source Transformer model with different attention types.

penny9287 avatar Jun 19 '19 05:06 penny9287

Hi, currently the Transformer decoder only supports the multi-head scaled dot-product attention from the "Attention is All You Need" paper. If you provide multiple encoders, you can choose which attention combination strategy to use: serial, parallel, hierarchical, or flat.

jindrahelcl avatar Jun 19 '19 21:06 jindrahelcl

I wonder how to specify the combination strategy for multiple encoders in the configuration file. Are there any examples?

wyjllm avatar Jun 20 '19 03:06 wyjllm

Just specify the attention_combination_strategy parameter in the Transformer decoder configuration. It can be one of serial, parallel, hierarchical, or flat.
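
A minimal sketch of what such a configuration could look like, assuming the usual Neural Monkey INI layout with two encoder sections referenced from the decoder. Only attention_combination_strategy is confirmed in this thread; the remaining section names and parameters (encoders, n_heads_enc, ff_hidden_size, etc.) are illustrative assumptions and should be checked against the current documentation:

```ini
; Hypothetical excerpt of a multi-source Neural Monkey config.
; Parameter names other than attention_combination_strategy are assumptions.

[encoder_src1]
class=encoders.transformer.TransformerEncoder
name="encoder_src1"
input_sequence=<sequence_src1>
ff_hidden_size=2048
depth=6
n_heads=8
dropout_keep_prob=0.9

[encoder_src2]
class=encoders.transformer.TransformerEncoder
name="encoder_src2"
input_sequence=<sequence_src2>
ff_hidden_size=2048
depth=6
n_heads=8
dropout_keep_prob=0.9

[decoder]
class=decoders.transformer.TransformerDecoder
name="decoder"
; the decoder attends to all encoders listed here
encoders=[<encoder_src1>, <encoder_src2>]
vocabulary=<target_vocabulary>
data_id="target"
; one of "serial", "parallel", "hierarchical", "flat"
attention_combination_strategy="serial"
ff_hidden_size=2048
depth=6
n_heads_self=8
n_heads_enc=8
max_output_len=50
dropout_keep_prob=0.9
```

Roughly speaking, serial runs one encoder-attention sublayer per encoder in sequence, parallel attends to each encoder independently and combines the resulting contexts, hierarchical adds a second attention over the per-encoder contexts, and flat attends jointly over the concatenated encoder states; see the Neural Monkey authors' paper on input combination strategies for multi-source translation for the exact definitions.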

jindrahelcl avatar Jun 20 '19 22:06 jindrahelcl