starcoder Is there a script for evaluating against eleutherAI’s language model evaluation harness?

Is there a script for evaluating against eleutherAI’s language model evaluation harness?

Open dh2shin opened this issue 2 years ago • 0 comments

trafficstars

Hello, I want to reproduce the lm evaluation harness results reported in the blog. Since the prompts need to be formatted with the user, assistant, system, end tokens, the evaluation harness does not work out of the box. I'm wondering if the team can share the script used to report the results in the table!

Jul 21 '23 00:07 dh2shin

starcoder starcoder copied to clipboard

Is there a script for evaluating against eleutherAI’s language model evaluation harness?

starcoder
starcoder copied to clipboard