Alham Fikri Aji
Are you, by any chance, training a quantized model from scratch? One option is to train a normal model first, then activate the quantization. Alternatively, not using --quantize-biases true should...
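A minimal sketch of the "train normally first, then activate quantization" idea, written in PyTorch rather than marian code; the model, data, bit width, and warmup length are all placeholders, and the uniform quantizer is just one possible choice:

```python
# Sketch (PyTorch, not marian): weights stay full-precision for the first
# warmup_steps updates; only afterwards are they snapped to 2**bits levels
# after every optimizer step. Biases are skipped, mirroring the effect of
# leaving --quantize-biases off.
import torch

def quantize_(tensor, bits=4):
    # Uniform quantization: snap values to 2**bits evenly spaced levels
    # spanning the tensor's current range.
    levels = 2 ** bits - 1
    lo, hi = tensor.min(), tensor.max()
    scale = (hi - lo) / levels if hi > lo else tensor.new_tensor(1.0)
    tensor.copy_(((tensor - lo) / scale).round() * scale + lo)

model = torch.nn.Linear(16, 16)            # stand-in for the real network
opt = torch.optim.SGD(model.parameters(), lr=0.01)
warmup_steps, bits = 1000, 4

for step in range(5000):
    x = torch.randn(32, 16)                # dummy batch
    loss = model(x).pow(2).mean()          # dummy objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step >= warmup_steps:               # quantization switched on late
        with torch.no_grad():
            for name, p in model.named_parameters():
                if "bias" not in name:     # keep biases full-precision
                    quantize_(p, bits)
```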
#self-assign
This dataset is a bit noisy at the moment: aside from having inconsistent labeling (numeric vs. string), some data has no labels at all. I've sent a PR to that...
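A hypothetical cleanup along the lines described above (the actual PR may do something different); the file name, column names, and label mapping are placeholders:

```python
# Unify numeric vs. string labels under one scheme and drop rows that
# carry no label at all. "dataset.tsv", "label", and LABEL_MAP are assumed.
import pandas as pd

LABEL_MAP = {"0": "negative", "1": "positive"}   # assumed string targets

df = pd.read_csv("dataset.tsv", sep="\t", dtype=str)
df["label"] = df["label"].str.strip().replace(LABEL_MAP)  # normalize labels
df = df.dropna(subset=["label"])                          # drop unlabeled rows
df = df[df["label"].str.len() > 0]
df.to_csv("dataset.clean.tsv", sep="\t", index=False)
```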
Some experiments: originally we could not train the transformer with async SGD (0.0 BLEU). But if we assume that the average words per batch in sync SGD is 4x larger compared to...
**What is the right way to specify this on the command line?** Currently we can set --batch-normal-words. I think the easiest way, both for us and for users, is to just scale...
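A rough sketch of the "just scale" idea under one possible interpretation: scale the learning rate by the ratio of the words actually seen in a batch to the reference given by --batch-normal-words. The flag name comes from the comment above; the base learning rate, the default reference value, and the assumption that the scaling is linear (rather than, say, square-root) are all illustrative, not marian's actual behaviour:

```python
# Scale the step size by actual words-per-batch relative to a reference,
# so e.g. a sync-SGD batch with 4x the words gets a 4x larger step.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--batch-normal-words", type=int, default=1920,
                    help="reference words-per-batch the base LR is tuned for")
args = parser.parse_args()

base_lr = 0.0003  # placeholder base learning rate

def scaled_lr(words_in_batch: int) -> float:
    # Linear scaling is an assumption here; other schedules are possible.
    return base_lr * words_in_batch / args.batch_normal_words

print(scaled_lr(7680))  # a batch 4x the reference -> 4x the base LR
```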
Will do it this week... so merge quantized training to master or to Nick's branch?