
4 comments of vkramgovind

Thanks for the quick reply. Will close the issue then.

Hi, I tried the suggested configuration, but BLEU still seems to be stuck around 2.2 after the 3rd epoch. Transformer-base seems to work well on a single GPU. Any other pointers?

Hi, I have attached train.log and valid.log. Please let me know if there is anything else you need: [train.log](https://github.com/marian-nmt/marian/files/2652042/train.log) [valid.log](https://github.com/marian-nmt/marian/files/2652043/valid.log)

The data is from the example only. Average-attention inference was a little faster than the default for transformer-base, so I went straight to average-attention for transformer-big.
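For context, a sketch of the kind of invocation being discussed, assuming Marian's documented `--task` preset and `--transformer-decoder-autoreg` options; the corpus paths, vocab files, and device ids below are placeholders, not the actual setup from this issue:

```shell
# Sketch only: file paths, vocabularies, and device list are hypothetical.
# --task transformer-big applies Marian's transformer-big preset;
# --transformer-decoder-autoreg average-attention replaces the decoder's
# self-attention with an average attention network for faster inference.
./marian \
    --task transformer-big \
    --transformer-decoder-autoreg average-attention \
    --train-sets corpus.src corpus.trg \
    --vocabs vocab.src.yml vocab.trg.yml \
    --valid-sets valid.src valid.trg \
    --valid-metrics bleu \
    --devices 0 1 2 3 \
    --log train.log --valid-log valid.log
```

With `--valid-metrics bleu` and `--valid-log valid.log`, the BLEU scores mentioned above would appear in valid.log at each validation step.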