evaluate
Difference between your BLEU and sacreBLEU
What is the difference between your package's bleu implementation and the sacrebleu implementation? I computed the score both ways and got different results. The text is Chinese, and I passed sacrebleu's zh tokenizer.
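For reference, here is a minimal sketch of one way such a gap could arise, assuming the two paths end up segmenting the Chinese text differently. It does not call either library: `whitespace_tokenize`, `char_tokenize`, and `modified_precision` are hypothetical helpers written for illustration, and `char_tokenize` is only a simplification of per-character splitting, not sacrebleu's actual zh tokenizer. The sentences are made-up examples.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(hyp_tokens, ref_tokens, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_counts = ngrams(hyp_tokens, n)
    ref_counts = ngrams(ref_tokens, n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total


def whitespace_tokenize(text):
    """Hypothetical tokenizer: split on whitespace only."""
    return text.split()


def char_tokenize(text):
    """Hypothetical tokenizer: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


hypothesis = "我喜欢吃苹果"
reference = "我喜欢吃香蕉"

# Unsegmented Chinese has no spaces, so each sentence becomes a single token.
p1_whitespace = modified_precision(
    whitespace_tokenize(hypothesis), whitespace_tokenize(reference), 1
)
# Per-character splitting exposes the overlapping characters.
p1_char = modified_precision(char_tokenize(hypothesis), char_tokenize(reference), 1)
p2_char = modified_precision(char_tokenize(hypothesis), char_tokenize(reference), 2)

print(p1_whitespace, p1_char, p2_char)
```

In this sketch the same sentence pair gets a unigram precision of 0 with whitespace splitting and 4/6 with per-character splitting, so the tokenizer alone can change the score. Whether that is what happens here depends on which tokenizer each implementation actually applies.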