instruct-eval
Add zero-shot evaluation results
Hi all, I read the code and realized that the results were obtained with 3-shot demonstrations. However, some models were trained to follow instructions without demonstrations, so they may rank higher in the result table under a zero-shot setting than under the current few-shot setting. It would be great if you could add zero-shot evaluation results. Thank you.
Good point, it is worth investigating zero-shot performance as well. We will try to add zero-shot results for MMLU in the next few weeks. For now, the zero-shot results for HumanEval are reported in the readme table (last column).
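For reference, here is a minimal sketch of how the k-shot setting changes an MMLU-style prompt. This is a hypothetical helper for illustration only, not the repo's actual evaluation code: setting k=0 yields a zero-shot prompt, while k=3 corresponds to the current few-shot setting.

```python
# Illustrative sketch only: hypothetical prompt builder, not the actual
# instruct-eval implementation. Shows how k controls few-shot vs zero-shot.

def build_prompt(question: str, choices: list[str], demos: list[dict], k: int = 3) -> str:
    """Build an MMLU-style multiple-choice prompt with k demonstrations.

    k=3 mirrors the current few-shot setting; k=0 gives a zero-shot
    prompt containing only the test question.
    """
    labels = ["A", "B", "C", "D"]
    parts = []
    for demo in demos[:k]:  # no demonstrations are prepended when k == 0
        lines = [demo["question"]]
        lines += [f"{label}. {choice}" for label, choice in zip(labels, demo["choices"])]
        lines.append(f"Answer: {demo['answer']}")
        parts.append("\n".join(lines))
    # The test question is left unanswered; the model completes "Answer:"
    test = [question] + [f"{label}. {choice}" for label, choice in zip(labels, choices)] + ["Answer:"]
    parts.append("\n".join(test))
    return "\n\n".join(parts)


if __name__ == "__main__":
    demos = [
        {"question": "What is 2 + 2?", "choices": ["3", "4", "5", "6"], "answer": "B"},
    ]
    # Zero-shot prompt: only the test question, no demonstrations.
    print(build_prompt("What is the capital of France?",
                       ["Berlin", "Paris", "Rome", "Madrid"], demos, k=0))
```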