training_policies

Transformer Quality Target Change

Open bitfort opened this issue 6 years ago • 7 comments

Note to follow up about the current transformer quality target (25->27?).

bitfort avatar Jan 17 '19 19:01 bitfort

SWG Notes:

We intend to move the quality target to 27. There is an action item (AI) to modify the reference and confirm that it reaches the target.

bitfort avatar Jan 17 '19 19:01 bitfort

SWG Notes:

AI(Cray) - Check target quality on English to French and English to German. Related to: https://github.com/mlperf/policies/issues/175

bitfort avatar Jan 24 '19 19:01 bitfort

SWG Notes:

(English to German) Published accuracy is 28.4; we are not yet able to hit 27 at the reference batch size and are continuing the parameter search. We expect the reference to hit 27, but with changes to learning rate / batch size.

(English to German) Google believes 27 can be hit at ~64k tokens global batch size. Above this, they haven't been able to converge, but are still exploring. A target of 27 roughly doubles the number of epochs versus a target of 25.

(English to French) Published accuracy is 43. Google has seen around 41, but investigation is ongoing.

Continuing Cray AI. AI(Google) - Explore English to French at scale (non-reference).

bitfort avatar Jan 31 '19 19:01 bitfort

SWG Notes:

We feel that variance is a concern here, especially at a target of 27. We'd like to increase accuracy, but want more information on variance to set the target.

AI(Cray & Google & CISCO) - Do some runs to 26 to look at variance (and provide data for 25.5 too).

bitfort avatar Mar 14 '19 18:03 bitfort

I was able to get 8 transformer reference runs in and saw convergence to 26.0 on Eng-to-Germ within 5 epochs for 5/8 runs, and within 6 epochs for the remaining 3.

Here is the relevant grep from the logs:

grep "Bleu score (uncased)" mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_*new/translation/logfile | grep ": 26"
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_0_new/translation/logfile:Bleu score (uncased): 26.452380418777466
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_1_new/translation/logfile:Bleu score (uncased): 26.39443278312683
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_2_new/translation/logfile:Bleu score (uncased): 26.0280579328537
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_3_new/translation/logfile:Bleu score (uncased): 26.264476776123047
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_4_new/translation/logfile:Bleu score (uncased): 26.29130184650421
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_5_new/translation/logfile:Bleu score (uncased): 26.16676688194275
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_6_new/translation/logfile:Bleu score (uncased): 26.01703405380249
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_7_new/translation/logfile:Bleu score (uncased): 26.256629824638367

mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_0_new/translation/logfile:Starting iteration 5
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_1_new/translation/logfile:Starting iteration 6
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_2_new/translation/logfile:Starting iteration 6
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_3_new/translation/logfile:Starting iteration 5
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_4_new/translation/logfile:Starting iteration 5
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_5_new/translation/logfile:Starting iteration 5
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_6_new/translation/logfile:Starting iteration 5
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_7_new/translation/logfile:Starting iteration 6
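
Since the open question is run-to-run variance, here is a rough sketch of how the grep output above could be summarized. It is not part of the reference code: the regexes, the `summarize` helper, and the choice of sample standard deviation are assumptions made for illustration. It also reports the "Starting iteration" values as logged, without assuming how they map to epoch counts.

```python
import re
import statistics
from collections import Counter

# Log lines copied from the grep output above; only the parsing is new.
P = "mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_"
GREP_OUTPUT = f"""
{P}0_new/translation/logfile:Bleu score (uncased): 26.452380418777466
{P}1_new/translation/logfile:Bleu score (uncased): 26.39443278312683
{P}2_new/translation/logfile:Bleu score (uncased): 26.0280579328537
{P}3_new/translation/logfile:Bleu score (uncased): 26.264476776123047
{P}4_new/translation/logfile:Bleu score (uncased): 26.29130184650421
{P}5_new/translation/logfile:Bleu score (uncased): 26.16676688194275
{P}6_new/translation/logfile:Bleu score (uncased): 26.01703405380249
{P}7_new/translation/logfile:Bleu score (uncased): 26.256629824638367
{P}0_new/translation/logfile:Starting iteration 5
{P}1_new/translation/logfile:Starting iteration 6
{P}2_new/translation/logfile:Starting iteration 6
{P}3_new/translation/logfile:Starting iteration 5
{P}4_new/translation/logfile:Starting iteration 5
{P}5_new/translation/logfile:Starting iteration 5
{P}6_new/translation/logfile:Starting iteration 5
{P}7_new/translation/logfile:Starting iteration 6
"""

# Hypothetical patterns: run index from the directory name, value from the line tail.
BLEU_RE = re.compile(r"_(\d+)_new/translation/logfile:Bleu score \(uncased\): ([\d.]+)")
ITER_RE = re.compile(r"_(\d+)_new/translation/logfile:Starting iteration (\d+)")


def summarize(text):
    """Return ({run: final BLEU}, {run: last logged iteration}) parsed from grep output."""
    scores = {int(run): float(bleu) for run, bleu in BLEU_RE.findall(text)}
    iters = {int(run): int(it) for run, it in ITER_RE.findall(text)}
    return scores, iters


scores, iters = summarize(GREP_OUTPUT)
values = list(scores.values())

print(f"runs:  {len(values)}")
print(f"mean:  {statistics.mean(values):.3f}")
print(f"stdev: {statistics.stdev(values):.3f}  (sample standard deviation)")
print(f"range: {min(values):.3f} .. {max(values):.3f}")
print(f"iteration counts: {dict(Counter(iters.values()))}")
```

Note that these are the first scores at or above 26.0 for each run, so the spread reflects how far each run overshoots the target rather than the variance at a fixed epoch.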

jbalma avatar Mar 28 '19 17:03 jbalma

SWG Notes:

No change to target accuracy for v0.6. We think we can move to a target quality of 27 for v0.7, given more time to work on the issue.

bitfort avatar Apr 11 '19 18:04 bitfort

Active, moving to backlog.

petermattson avatar May 29 '20 20:05 petermattson