Add helper function to run HuggingFace evals on HookedTransformer

Open neelnanda-io opened this issue 2 years ago • 3 comments

Pick some example evals from here (e.g. PIQA, TriviaQA, LAMBADA) and write code to run HookedTransformer on them: https://huggingface.co/docs/evaluate/index

A demo notebook doing this for specific benchmarks would be a good MVP; a bonus would be a generic function that works for any eval (or, e.g., one for multiple-choice evals and another for other types).
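As a rough sketch of what the generic multiple-choice case could look like (this is an illustration, not an existing TransformerLens API): score each candidate answer by the log-probability the model assigns to it given the context, and pick the highest. The names `pick_choice`, `accuracy` and `hooked_transformer_logprob` are hypothetical. The adapter assumes `model.to_tokens` accepts `prepend_bos` and that calling the model on tokens returns logits of shape `[batch, pos, vocab]`; it also tokenizes context and continuation separately, which a real harness handles more carefully.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps (context, continuation) to log p(continuation | context).
LogProbFn = Callable[[str, str], float]


def pick_choice(logprob_fn: LogProbFn, context: str, choices: Sequence[str]) -> int:
    """Return the index of the choice with the highest log-probability."""
    if not choices:
        raise ValueError("need at least one choice")
    scores = [logprob_fn(context, choice) for choice in choices]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(
    logprob_fn: LogProbFn,
    examples: Sequence[Tuple[str, Sequence[str], int]],
) -> float:
    """Fraction of (context, choices, label) examples answered correctly."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(
        pick_choice(logprob_fn, context, choices) == label
        for context, choices, label in examples
    )
    return correct / len(examples)


def hooked_transformer_logprob(model) -> LogProbFn:
    """Hypothetical adapter: wrap a HookedTransformer as a LogProbFn.

    Assumption: `model` behaves like transformer_lens.HookedTransformer.
    torch is imported lazily so the functions above work without it.
    """
    import torch

    def logprob_fn(context: str, continuation: str) -> float:
        ctx = model.to_tokens(context)  # BOS prepended by default
        cont = model.to_tokens(continuation, prepend_bos=False)
        tokens = torch.cat([ctx, cont], dim=1)
        with torch.no_grad():
            logprobs = model(tokens).log_softmax(dim=-1)  # [1, pos, vocab]
        n = cont.shape[1]
        # The logits at position i predict token i + 1, so the predictions
        # for the n continuation tokens sit at positions -n-1 .. -2.
        preds = logprobs[0, -n - 1 : -1, :]
        return preds.gather(-1, cont[0].unsqueeze(-1)).sum().item()

    return logprob_fn
```

With a loaded model this would be used as `accuracy(hooked_transformer_logprob(model), examples)`, where `examples` is built from whichever dataset (e.g. PIQA) is being evaluated. Summed log-probability favours shorter answers, so a length-normalised score may be preferable for some benchmarks.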

neelnanda-io avatar Dec 19 '22 12:12 neelnanda-io

I think https://github.com/EleutherAI/lm-evaluation-harness would be a good place to start here. Anyone doing this should be aware that the harness is going to be refactored (https://www.youtube.com/watch?v=6qDOUeQTp0I), so they should probably chat to the Eleuther people first.

ArthurConmy avatar Mar 17 '23 13:03 ArthurConmy

@ArthurConmy Do you know if that refactoring has been done? Could someone have a go at this now?

jbloomAus avatar Mar 27 '23 07:03 jbloomAus

I don't think the refactor is done.

I guess HF Evals are different from the way lm-eval-harness downloads datasets, maybe we should do this?

All three datasets mentioned are in lm-eval-harness, though. I propose that we add lm-eval-harness as an optional extra dependency (e.g. so that pip install transformer-lens[evals] installs it) and provide a way to pass a HookedTransformer to the eval harness (currently it only supports HF AutoModelForCausalLM models, I think).
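For the optional-dependency part, one possible shape is a small guard that the eval helpers call before touching the harness, so that a plain install still imports cleanly. This is only a sketch: `evals_available` and `require_evals` are hypothetical names, the `evals` extra does not exist yet, and the only fact assumed about the harness is that it is imported as `lm_eval`.

```python
import importlib.util


def evals_available(module_name: str = "lm_eval") -> bool:
    """Return True if the optional eval dependency can be imported."""
    return importlib.util.find_spec(module_name) is not None


def require_evals(module_name: str = "lm_eval") -> None:
    """Raise a helpful ImportError if the optional dependency is missing.

    The install hint assumes the proposed `evals` extra has been added.
    """
    if not evals_available(module_name):
        raise ImportError(
            f"'{module_name}' is not installed. "
            "Install it with: pip install transformer-lens[evals]"
        )
```

An eval helper would call `require_evals()` on its first line, so users without the extra get an actionable message rather than a bare import failure.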

ArthurConmy avatar Mar 27 '23 08:03 ArthurConmy