How can I see approx_bleu on the validation set?
Hello, I use T2T (version 1.5.5) for a translation task.
The settings I use are as follows: PROBLEM=translate_enzh_wmt32k, MODEL=transformer, HPARAMS=transformer_base_single_gpu.
I used t2t-trainer.py to train a model. When it evaluated on the validation set, it output the following: "loss = 8.52209, metrics-translate_enzh_wmt32k/neg_log_perplexity = -9.75649"
When I rerun t2t-trainer.py, it outputs information about loss and accuracy (or other metrics) when evaluating on the validation set. Why?
Does it output the metrics randomly? How can I see approx_bleu on the validation set during evaluation?
I am not sure what your main question (problem) is.
Use TensorBoard and possibly also t2t-bleu; see e.g. #587.
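For reference, t2t-bleu is essentially a thin wrapper around bleu_wrapper in bleu_hook.py, so you can also call it from Python. A minimal sketch (the file names are placeholders, and the exact signature may differ between T2T versions):

```python
# Sketch: score a decoded file against a reference file, roughly what t2t-bleu does.
# "ref.txt" and "hyp.txt" are hypothetical plain-text files, one sentence per line.
from tensor2tensor.utils import bleu_hook

bleu_uncased = bleu_hook.bleu_wrapper("ref.txt", "hyp.txt", case_sensitive=False)
bleu_cased = bleu_hook.bleu_wrapper("ref.txt", "hyp.txt", case_sensitive=True)
print("BLEU_uncased = %.2f" % (100 * bleu_uncased))
print("BLEU_cased   = %.2f" % (100 * bleu_cased))
```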
@martinpopel
Hello, t2t-bleu is used to get the real BLEU.
But approx_bleu is computed by the bleu_score function in bleu_hook.py.
Yes.
(Both t2t-bleu and approx_bleu use the same code in bleu_hook.py, but approx_bleu applies it to subwords instead of words, and it cheats by looking at the previous word from the reference translation, i.e. it is not autoregressive unless --eval_run_autoregressive is used.)
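To make the subword-vs-word distinction concrete, here is a rough sketch using compute_bleu from bleu_hook.py (the toy token lists are made up, and the exact signature may differ between T2T versions); the same n-gram code runs in both cases, only the tokenization changes:

```python
# Sketch: the same BLEU code scored on word tokens vs. subword tokens.
from tensor2tensor.utils import bleu_hook

# Word-level tokens, roughly what t2t-bleu scores after detokenization.
ref_words = [["the", "cat", "sat", "on", "the", "mat"]]
hyp_words = [["the", "cat", "sits", "on", "the", "mat"]]
print(bleu_hook.compute_bleu(ref_words, hyp_words))

# Subword-level tokens, closer to what approx_bleu sees during evaluation
# (here "sits" is split into two made-up subwords).
ref_subwords = [["the_", "cat_", "sat_", "on_", "the_", "mat_"]]
hyp_subwords = [["the_", "cat_", "sit", "s_", "on_", "the_", "mat_"]]
print(bleu_hook.compute_bleu(ref_subwords, hyp_subwords))
```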
I used to train translation models with T2T (version 1.0.14). When it began to evaluate on the validation set, the log would output information like this: INFO:tensorflow:Saving dict for global step 9704: global_step = 9704, loss = 4.03075, metrics-wmt_zhen_tokens_32k/accuracy = 0.40701, metrics-wmt_zhen_tokens_32k/accuracy_per_sequence = 0.0, metrics-wmt_zhen_tokens_32k/accuracy_top5 = 0.656632, metrics-wmt_zhen_tokens_32k/approx_bleu_score = 0.120866, metrics-wmt_zhen_tokens_32k/neg_log_perplexity = -3.29166, metrics/accuracy = 0.40701, metrics/accuracy_per_sequence = 0.0, metrics/accuracy_top5 = 0.656632, metrics/approx_bleu_score = 0.120866, metrics/neg_log_perplexity = -3.29166
But when I use T2T (version 1.5.5), the information output during evaluation looks like this: [2018-03-13 19:40:59,878] Saving dict for global step 72002: global_step = 72002, loss = 2.13578, metrics-translate_enzh_wmt32k/accuracy_per_sequence = 0.00976631
Where is approx_bleu now?
I can confirm that approx_bleu is now missing.
@stefan-it It's a bug and I have fixed it now.
I can confirm that approx_bleu is shown in the latest version of tensor2tensor, so @jiangbojian could close this issue here :)
Hello, my problem is also enzh and I use transformer_base_single_gpu. However, I use my own dataset, and its size is about 6 million (600W) examples. My issue is that approx_bleu_score is about 20.x, but when I run t2t-bleu, I find that BLEU-uncased and BLEU-cased are both 5.x. I don't understand why there is such a huge difference between approx_bleu_score and BLEU-uncased/BLEU-cased. Thank you~
t2t-bleu is not suitable for Chinese (as the target language). Use sacrebleu --tok zh instead. See https://github.com/awslabs/sockeye/tree/master/contrib/sacrebleu
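If you prefer calling it from Python instead of the command line, the equivalent of sacrebleu --tok zh is roughly the following sketch (the example sentences are placeholders, and sacrebleu's API has changed a bit across versions):

```python
# Sketch: Chinese-aware BLEU with sacrebleu's Python API (tokenize="zh").
import sacrebleu

hyps = ["这是一个测试。"]    # system outputs, one string per sentence
refs = [["这是一个测试。"]]  # one reference stream (a list of sentences)

bleu = sacrebleu.corpus_bleu(hyps, refs, tokenize="zh")
print(bleu.score)
```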
Simple question: what script should we run to get the approx_bleu for a given checkpoint? (I'm asking because I want to compare the quality of a single checkpoint against an averaged checkpoint.)