Support for a tokenizer parameter in metrics
Some metrics, such as Rouge, accept a tokenizer parameter to better support non-English languages. It would be helpful to expose this option.
https://discuss.huggingface.co/t/which-tokenizer-does-rouge-metric-uses-under-the-hood/19903
https://github.com/google-research/google-research/blob/e3d00617cb28064b6e96ab4e2485079f0ca5a763/rouge/rouge_scorer.py#L60
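To illustrate why this matters, here is a minimal, self-contained sketch. As far as I can tell, the default tokenizer in the linked `rouge_scorer` keeps only ASCII letters and digits, so non-Latin text loses all its tokens. The class `UnicodeWordTokenizer`, the helper `ascii_only_tokenize`, and the simplified `rouge1_f` score are hypothetical names made up for this example; they are not part of any library, and `rouge1_f` is only a unigram-overlap F1, not the full Rouge implementation.

```python
import re
from collections import Counter


class UnicodeWordTokenizer:
    """Hypothetical tokenizer exposing a tokenize(text) method."""

    def tokenize(self, text):
        # In Python 3, \w matches Unicode letters and digits, not only ASCII.
        return re.findall(r"\w+", text.lower())


def ascii_only_tokenize(text):
    # Assumption: mimics an ASCII-only default, where everything outside
    # a-z and 0-9 is treated as a separator.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def rouge1_f(reference, prediction, tokenize):
    """Simplified unigram-overlap F1, for illustration only."""
    ref = Counter(tokenize(reference))
    pred = Counter(tokenize(prediction))
    overlap = sum((ref & pred).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "привет мир"
prediction = "привет всем"
tok = UnicodeWordTokenizer()

print(ascii_only_tokenize(reference))  # no tokens survive
print(tok.tokenize(reference))
print(rouge1_f(reference, prediction, ascii_only_tokenize))
print(rouge1_f(reference, prediction, tok.tokenize))
```

If the `rouge_score` package is installed, an object like `UnicodeWordTokenizer()` should be usable as the `tokenizer` argument of `rouge_scorer.RougeScorer`, since that argument expects an object with a `tokenize(text)` method. Exposing the option in our metric would then mostly mean passing such an object through.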
cc: @perlitz @yoavkatz @gitMichal
I also came across this implementation from the authors of XL-Sum:
https://github.com/csebuetnlp/xl-sum/tree/master/multilingual_rouge_scoring
Also, in the meeting with Hans' team, they said that we can use Rouge as is (with the tokenizer), with no need for stemming. The results will be lower, but we only care about comparison (not absolute values), so it should be fine.