Hiroshige Aoki
Results: 2 issues of Hiroshige Aoki
# Checklist: > [!IMPORTANT] > Please review the checklist below before submitting your pull request. - [x] Please open an issue before creating a PR or link to an existing...
🐞 bug
size:M
### Feature request Add an option to the RLOOTrainer that enables the use of string-based reward models, such as BLEU and Levenshtein distance, for evaluating model outputs. ### Motivation Currently,...
✨ enhancement
🏋 RLOO