seq2seq
Deprecate non-standard BLEU scripts
multi-bleu.perl has been deprecated for years now because it encourages people to use non-standard tokenization. This repository contains another non-standard BLEU implementation that a user might not notice they are using: https://github.com/google/seq2seq/blob/7f485894d412e8d81ce0e07977831865e44309ce/seq2seq/metrics/bleu.py
A paper by @mjpost (https://www.aclweb.org/anthology/W18-6319/) shows how much reported BLEU scores can vary depending on tokenization and other preprocessing choices.
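To make the tokenization point concrete, here is a minimal sketch. It is not full BLEU (no brevity penalty, no geometric mean over n-gram orders) and it is not the code from this repository or from Moses. It only computes the modified n-gram precision that BLEU builds on. The two tokenizers, the helper names, and the example sentences are all made up for illustration.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(hyp_tokens, ref_tokens, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp = ngrams(hyp_tokens, n)
    ref = ngrams(ref_tokens, n)
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    total = sum(hyp.values())
    return matched / total if total else 0.0


def whitespace_tok(text):
    """Hypothetical tokenizer A: split on whitespace only."""
    return text.split()


def punct_tok(text):
    """Hypothetical tokenizer B: also split punctuation off from words."""
    return re.findall(r"\w+|[^\w\s]", text)


# Illustrative sentences, not taken from any real test set.
hyp = "the cat sat on the mat."
ref = "the cat sat on the mat !"

for name, tok in [("whitespace", whitespace_tok), ("punct-split", punct_tok)]:
    p1 = modified_precision(tok(hyp), tok(ref), 1)
    p2 = modified_precision(tok(hyp), tok(ref), 2)
    print(f"{name}: unigram={p1:.3f} bigram={p2:.3f}")
```

The same hypothesis and reference get different precisions under the two tokenizers, because "mat." only matches "mat" once the punctuation is split off. Scores computed with different tokenizations are therefore not comparable.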
Please put fat warnings on the non-standard BLEU scripts saying that their scores are not appropriate for publication, as Moses does, and remove multi-bleu.perl from the examples.