Leonid Sinev
Mentioning issues that request some sort of chat templating, hoping those subscribed will take a peek at (and maybe test) this pull request: https://github.com/EleutherAI/lm-evaluation-harness/issues/1098 https://github.com/EleutherAI/lm-evaluation-harness/issues/1209 https://github.com/EleutherAI/lm-evaluation-harness/issues/1490
By the way, you can take a peek at a previous attempt at a chat-templating PR: https://github.com/EleutherAI/lm-evaluation-harness/pull/1287 Just in case any pitfalls were discussed or mentioned there.
This detailed comment may also be interesting to readers here: https://github.com/EleutherAI/lm-evaluation-harness/issues/1560#issuecomment-1999204933
> there are going to be more custom templates
>
> it is important to apply the template for that use case

So, the template name shouldn't be fixed in...
Thank you for your efforts! A great table of results to compare!

> Where did the difference come from?

Please check other issues/discussions about speed, batching, and multi-GPU usage for...
> Is this also an expected result?

No idea. According to the results in your table, it is also a task-dependent issue. You may want to research this case further with...
Thanks for your help. I will try to use the described solution while experimenting with moving a custom Python Task to YAML form using ConfigurableTask. Not sure about the time frame of...
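For context, a YAML-based ConfigurableTask in lm-evaluation-harness looks roughly like the sketch below. The field names follow the harness's task-config convention, but the task and dataset names here are hypothetical placeholders, not from any real conversion:

```yaml
# Hypothetical task/dataset names; field names follow lm-evaluation-harness task configs.
task: my_custom_task
dataset_path: my_org/my_dataset
output_type: multiple_choice
training_split: train
test_split: test
doc_to_text: "{{question}}"
doc_to_choice: "{{choices}}"
doc_to_target: "{{answer}}"
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
```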
Just linking another issue (not sure if it is the only one) about inference time reporting: https://github.com/EleutherAI/lm-evaluation-harness/issues/1236
Made it backward compatible and added a seeds report to the results file. Also updated the code to be compatible with the main branch for ease of merging. Please check this out, @haileyschoelkopf
@djstrong What do you think of this suggested workaround with logit_bias? https://github.com/EleutherAI/lm-evaluation-harness/issues/1196#issuecomment-1948246171
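As I understand the linked suggestion, the idea behind a `logit_bias` workaround is to additively bias a small set of answer tokens so decoding is effectively restricted to them (the OpenAI-style parameter clamps values to [-100, 100]). A toy sketch of that mechanism, with hypothetical token IDs and scores not tied to any real tokenizer:

```python
def apply_logit_bias(logits: dict[int, float], bias: dict[int, float]) -> dict[int, float]:
    """Add per-token biases, clamped to [-100, 100], to raw logits."""
    return {
        tok: score + max(-100.0, min(100.0, bias.get(tok, 0.0)))
        for tok, score in logits.items()
    }

def greedy_pick(logits: dict[int, float]) -> int:
    """Pick the highest-scoring token id (greedy decoding)."""
    return max(logits, key=logits.get)

# Hypothetical vocabulary: token 32 = "A", token 33 = "B", token 50 = "Sure".
raw = {32: 1.0, 33: 0.5, 50: 4.0}

# Strongly favour the answer tokens "A"/"B" over everything else.
biased = apply_logit_bias(raw, {32: 100, 33: 100})

print(greedy_pick(raw))     # unconstrained: prints 50
print(greedy_pick(biased))  # restricted to the answer set: prints 32
```

Whether this generalizes beyond single-token answer sets is exactly the kind of thing worth checking against the discussion in that issue.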