Vincent Nguyen

Results 123 comments of Vincent Nguyen

If you want to be sure, change the exit condition here: https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/translate/beam_search.py#L192 and replace self.beam_size with self.n_best. In theory, with the new condition, you should get slightly better scores.
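To make the suggestion concrete, here is a toy sketch of a beam-search exit check with a configurable threshold. It is not the OpenNMT-py code at the linked line: the function name `beam_is_done` and its arguments are hypothetical, and the only point is that a threshold of n_best fires no later than a threshold of beam_size whenever n_best <= beam_size.

```python
def beam_is_done(num_finished, top_beam_finished, min_finished):
    """Toy exit check (hypothetical, not the library's implementation).

    num_finished: finished hypotheses collected so far for one sentence.
    top_beam_finished: whether the best-scoring beam has ended.
    min_finished: required count, e.g. beam_size or n_best.
    """
    return top_beam_finished and num_finished >= min_finished


# Assumed toy values: beam_size=5, n_best=1, two finished hypotheses.
stop_with_n_best = beam_is_done(2, True, min_finished=1)
stop_with_beam_size = beam_is_done(2, True, min_finished=5)
```

With these assumed values the search stops under the n_best threshold but keeps going under the beam_size threshold.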

There should not be a convergence issue. The best option may be to submit a PR so we can have a look at your code. PS: just after the Transformer paper,...

At first sight this looks good. Can you give more info on your training and results?

Just tested PyTorch 1.13 optim.Adam( , fused=True): it is about 5% slower than fused=False (test done with a large Transformer training run).

I am seconding this. It would be great to implement BERT-like models with encoders only + classification head. More specifically if we can use a pre-trained parser like this: https://ufal.mff.cuni.cz/udpipe/2/models it...

Like Marian, we have a mechanism to average models on the fly (average_decay option). That's why we don't save models so often, but I am curious to understand why the...
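For readers unfamiliar with on-the-fly averaging, the idea can be sketched as an exponential moving average of the weights. This is a generic illustration under that assumption, not the OpenNMT-py or Marian implementation; `update_running_average` and the dict-of-floats parameters are hypothetical stand-ins.

```python
def update_running_average(average, current, decay):
    """Return the running average after blending in the current weights.

    `average` and `current` map parameter names to floats; `decay` is the
    weight given to the newest values (a hypothetical stand-in for an
    option such as average_decay).
    """
    return {
        name: (1.0 - decay) * average[name] + decay * current[name]
        for name in average
    }


# Toy usage: one scalar "parameter" updated over two training steps.
average = {"w": 0.0}
average = update_running_average(average, {"w": 1.0}, decay=0.5)
average = update_running_average(average, {"w": 1.0}, decay=0.5)
```

Because the average is maintained during training, a single saved checkpoint can already carry smoothed weights, which is one reason frequent checkpointing becomes less necessary.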

Okay, in fact there is an issue with the valid batch size (8 tokens when batch_type=tokens, whereas it used to be 8 sentences in the past - we need to...

Michael, when looking again at the config, there is still a discrepancy that could explain the BLEU difference. When you set rsqrt in ONMT there is no linear increase from...
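One possible reading of this comment is the difference between an inverse-square-root schedule that holds its peak value from the first step and one that ramps up linearly first. The sketch below only illustrates that difference: both function names are hypothetical, and the exact formulas used by either toolkit are an assumption here.

```python
import math


def rsqrt_flat_start(step, warmup_steps):
    """Scale factor held at its peak until warmup_steps, then 1/sqrt(step)."""
    return 1.0 / math.sqrt(max(step, warmup_steps))


def rsqrt_linear_warmup(step, warmup_steps):
    """Same decay, but the factor grows linearly from 0 during warmup."""
    ramp = min(step / warmup_steps, 1.0)
    return ramp / math.sqrt(max(step, warmup_steps))


# Assumed toy setting: 4000 warmup steps, compared early and late in training.
early_flat = rsqrt_flat_start(1000, 4000)
early_ramped = rsqrt_linear_warmup(1000, 4000)
late_flat = rsqrt_flat_start(8000, 4000)
late_ramped = rsqrt_linear_warmup(8000, 4000)
```

The two schedules agree after warmup, so any difference in training behavior would come from the early steps, where the flat variant applies a larger learning rate.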

The wall time still looks high. Were you able to run with a batch size of 5000 update 10? Did you keep the log?

Great, thanks. It seems that TorchScript brings the 5-10% improvement on 1 GPU, but I am unsure about the big gap on 8 GPUs. We use torch.distributed as well so...