Felipe Maia Polo
Dear authors, thank you for the great work. I would like to fine-tune UniEval on other datasets, for example XSum (instead of CNN/DailyMail). What do you think the best way is...
This PR introduces a new `--examples` argument to the evaluation pipeline in `lm-evaluation-harness`, enabling users to evaluate specific examples across multiple tasks. This enhancement extends the functionality of the `--limit`...
Hello, do you make model trajectories (and their interactions with the system) available? I couldn't find them. Thank you!