mobile_app_open
LLM TinyMMLU Dataset Implementation
As described in the parent issue, the TinyMMLU implementation is nearly complete. The only remaining items I think may need work are:
- ~Load data sample by sample as needed instead of all at once.~ (not necessary)
- ~Save output tokens in processing step and do all accuracy related calculations on the accuracy side. (needs discussion)~
- ~handle instances where no answer letter is provided. (likely fail query)~ (currently being done implicitly)
- ~properly build zero-shot prompt rather than cutting input_formatted~
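
For reference, here is a minimal sketch of what handling a missing answer letter explicitly could look like, as opposed to the current implicit behavior. It is illustrative only: `extract_answer_letter` and `is_correct` are hypothetical names, not functions from this repository, and the sketch assumes the model output is available as decoded text with answer choices A to D.

```python
import re
from typing import Optional

# Hypothetical helpers; not taken from the mobile_app_open code base.
# Matches a standalone capital letter A-D. Note this is naive: an English
# article "A" at the start of a sentence would also match.
_ANSWER_RE = re.compile(r"\b([ABCD])\b")


def extract_answer_letter(output_text: str) -> Optional[str]:
    """Return the first standalone A-D letter in the output, or None if absent."""
    match = _ANSWER_RE.search(output_text)
    return match.group(1) if match else None


def is_correct(output_text: str, expected: str) -> bool:
    """Score one sample; an output with no answer letter counts as a failed query."""
    letter = extract_answer_letter(output_text)
    return letter is not None and letter == expected
```

With this shape, an output such as `"I don't know"` yields `None` and is scored as incorrect rather than raising or being skipped.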