alignment-handbook
Add instructions to evaluate on academic datasets
The paper evaluates on ARC, HellaSwag, MMLU, and TruthfulQA, but this repo does not reference these evals. Adding a short explanation of how to run them (e.g., in https://github.com/huggingface/alignment-handbook/tree/main/scripts#evaluating-chat-models) would be nice.
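
For reference, a minimal sketch of what such instructions could cover, assuming EleutherAI's lm-evaluation-harness (v0.4+) is used, as on the Open LLM Leaderboard. The model name is just an illustrative placeholder, and the few-shot counts are the leaderboard's settings (25/10/5/0), not something this repo prescribes:

```python
# Sketch: score a chat model on the four academic benchmarks with
# lm-evaluation-harness, one task at a time so each can use its own
# few-shot setting. Assumes `pip install lm-eval` (v0.4+).
import lm_eval

# (task name, num_fewshot) pairs following the Open LLM Leaderboard config
TASKS = [
    ("arc_challenge", 25),
    ("hellaswag", 10),
    ("mmlu", 5),
    ("truthfulqa_mc2", 0),
]

for task, num_fewshot in TASKS:
    out = lm_eval.simple_evaluate(
        model="hf",  # Hugging Face transformers backend
        model_args="pretrained=HuggingFaceH4/zephyr-7b-beta",  # example model
        tasks=[task],
        num_fewshot=num_fewshot,
        batch_size=8,
    )
    # Print the per-task metrics dict (e.g., acc / acc_norm)
    print(task, out["results"][task])
```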