firefox-translations-training
Investigate using LLMs for evaluation
It would be interesting to compare how well LLMs evaluate translation quality against COMET and human evaluation.
See the paper "Large Language Models Are State-of-the-Art Evaluators of Translation Quality" (Kocmi and Federmann, 2023).
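
One possible way to frame the comparison is to score the same set of translations with each evaluator and check how closely each automatic score tracks the human judgments. The sketch below is only an illustration of that idea, assuming segment-level scores are already available as plain lists. The function name `kendall_tau` and all numbers are hypothetical placeholders, not results from this repository or from the paper, and rank correlation is just one of several agreement measures that could be used.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equally long score lists.

    Pairs tied in either list count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Placeholder scores, invented only to show the call shape (not real results).
human = [1, 2, 3, 4]           # hypothetical human ratings per segment
comet = [0.2, 0.1, 0.5, 0.9]   # hypothetical COMET scores for the same segments
llm = [10, 20, 30, 40]         # hypothetical LLM-assigned scores

print("COMET vs human:", kendall_tau(human, comet))
print("LLM vs human:", kendall_tau(human, llm))
```

With real data, the evaluator whose scores correlate more strongly with human judgments on the same segments would be the better proxy for human evaluation under this measure.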