llm-foundry
Verify icl cfgs
Run with `amp_fp16`:
| Benchmark | Subcategory | Accuracy | Number few shot | Model |
|---|---|---|---|---|
| jeopardy | Average | 0.279767 | 0 | mosaicml/mpt-7b |
| | american_history | 0.365617 | 0 | mosaicml/mpt-7b |
| | literature | 0.318367 | 0 | mosaicml/mpt-7b |
| | science | 0.138655 | 0 | mosaicml/mpt-7b |
| | word_origins | 0.115068 | 0 | mosaicml/mpt-7b |
| | world_history | 0.461126 | 0 | mosaicml/mpt-7b |
| lambada_openai | | 0.70328 | 0 | mosaicml/mpt-7b |
| piqa | | 0.799238 | 0 | mosaicml/mpt-7b |
| hellaswag | | 0.761701 | 0 | mosaicml/mpt-7b |
| arc_easy | | 0.67298 | 0 | mosaicml/mpt-7b |
| arc_challenge | | 0.396758 | 0 | mosaicml/mpt-7b |
| copa | | 0.8 | 0 | mosaicml/mpt-7b |
| boolq | | 0.748012 | 0 | mosaicml/mpt-7b |
| mmlu | Average | 0.293343 | 0 | mosaicml/mpt-7b |
| | abstract_algebra | 0.32 | 0 | mosaicml/mpt-7b |
| | anatomy | 0.355556 | 0 | mosaicml/mpt-7b |
| | astronomy | 0.256579 | 0 | mosaicml/mpt-7b |
| | business_ethics | 0.28 | 0 | mosaicml/mpt-7b |
| | clinical_knowledge | 0.30566 | 0 | mosaicml/mpt-7b |
| | college_biology | 0.305556 | 0 | mosaicml/mpt-7b |
| | college_chemistry | 0.21 | 0 | mosaicml/mpt-7b |
| | college_computer_science | 0.27 | 0 | mosaicml/mpt-7b |
| | college_mathematics | 0.27 | 0 | mosaicml/mpt-7b |
| | college_medicine | 0.289017 | 0 | mosaicml/mpt-7b |
| | college_physics | 0.22549 | 0 | mosaicml/mpt-7b |
| | computer_security | 0.33 | 0 | mosaicml/mpt-7b |
| | conceptual_physics | 0.251064 | 0 | mosaicml/mpt-7b |
| | econometrics | 0.263158 | 0 | mosaicml/mpt-7b |
| | electrical_engineering | 0.310345 | 0 | mosaicml/mpt-7b |
| | elementary_mathematics | 0.304233 | 0 | mosaicml/mpt-7b |
| | formal_logic | 0.277778 | 0 | mosaicml/mpt-7b |
| | global_facts | 0.34 | 0 | mosaicml/mpt-7b |
| | high_school_biology | 0.306452 | 0 | mosaicml/mpt-7b |
| | high_school_chemistry | 0.285714 | 0 | mosaicml/mpt-7b |
| | high_school_computer_science | 0.33 | 0 | mosaicml/mpt-7b |
| | high_school_european_history | 0.254545 | 0 | mosaicml/mpt-7b |
| | high_school_geography | 0.333333 | 0 | mosaicml/mpt-7b |
| | high_school_government_and_politics | 0.321244 | 0 | mosaicml/mpt-7b |
| | high_school_macroeconomics | 0.266667 | 0 | mosaicml/mpt-7b |
| | high_school_mathematics | 0.240741 | 0 | mosaicml/mpt-7b |
| | high_school_microeconomics | 0.268908 | 0 | mosaicml/mpt-7b |
| | high_school_physics | 0.271523 | 0 | mosaicml/mpt-7b |
| | high_school_psychology | 0.286239 | 0 | mosaicml/mpt-7b |
| | high_school_statistics | 0.185185 | 0 | mosaicml/mpt-7b |
| | high_school_us_history | 0.308824 | 0 | mosaicml/mpt-7b |
| | high_school_world_history | 0.2827 | 0 | mosaicml/mpt-7b |
| | human_aging | 0.246637 | 0 | mosaicml/mpt-7b |
| | human_sexuality | 0.274809 | 0 | mosaicml/mpt-7b |
| | international_law | 0.413223 | 0 | mosaicml/mpt-7b |
| | jurisprudence | 0.324074 | 0 | mosaicml/mpt-7b |
| | logical_fallacies | 0.368098 | 0 | mosaicml/mpt-7b |
| | machine_learning | 0.267857 | 0 | mosaicml/mpt-7b |
| | management | 0.320388 | 0 | mosaicml/mpt-7b |
| | marketing | 0.324786 | 0 | mosaicml/mpt-7b |
| | medical_genetics | 0.26 | 0 | mosaicml/mpt-7b |
| | miscellaneous | 0.374202 | 0 | mosaicml/mpt-7b |
| | moral_disputes | 0.315029 | 0 | mosaicml/mpt-7b |
| | moral_scenarios | 0.248045 | 0 | mosaicml/mpt-7b |
| | nutrition | 0.310458 | 0 | mosaicml/mpt-7b |
| | philosophy | 0.33119 | 0 | mosaicml/mpt-7b |
| | prehistory | 0.345679 | 0 | mosaicml/mpt-7b |
| | professional_accounting | 0.276596 | 0 | mosaicml/mpt-7b |
| | professional_law | 0.290091 | 0 | mosaicml/mpt-7b |
| | professional_medicine | 0.205882 | 0 | mosaicml/mpt-7b |
| | professional_psychology | 0.289216 | 0 | mosaicml/mpt-7b |
| | public_relations | 0.272727 | 0 | mosaicml/mpt-7b |
| | security_studies | 0.236735 | 0 | mosaicml/mpt-7b |
| | sociology | 0.293532 | 0 | mosaicml/mpt-7b |
| | us_foreign_policy | 0.35 | 0 | mosaicml/mpt-7b |
| | virology | 0.277108 | 0 | mosaicml/mpt-7b |
| | world_religions | 0.397661 | 0 | mosaicml/mpt-7b |
| winograd | | 0.868132 | 0 | mosaicml/mpt-7b |
| winogrande | | 0.685083 | 0 | mosaicml/mpt-7b |
| triviaqa | | 0.343057 | 0 | mosaicml/mpt-7b |
Run with `amp_bf16`:
| Benchmark | Subcategory | Accuracy | Number few shot | Model |
|---|---|---|---|---|
| jeopardy | Average | 0.273737 | 0 | mosaicml/mpt-7b |
| | american_history | 0.355932 | 0 | mosaicml/mpt-7b |
| | literature | 0.308163 | 0 | mosaicml/mpt-7b |
| | science | 0.136555 | 0 | mosaicml/mpt-7b |
| | word_origins | 0.109589 | 0 | mosaicml/mpt-7b |
| | world_history | 0.458445 | 0 | mosaicml/mpt-7b |
| lambada_openai | | 0.686202 | 0 | mosaicml/mpt-7b |
| piqa | | 0.799238 | 0 | mosaicml/mpt-7b |
| hellaswag | | 0.762199 | 0 | mosaicml/mpt-7b |
| arc_easy | | 0.673401 | 0 | mosaicml/mpt-7b |
| arc_challenge | | 0.391638 | 0 | mosaicml/mpt-7b |
| copa | | 0.8 | 0 | mosaicml/mpt-7b |
| boolq | | 0.739144 | 0 | mosaicml/mpt-7b |
| mmlu | Average | 0.292015 | 0 | mosaicml/mpt-7b |
| | abstract_algebra | 0.3 | 0 | mosaicml/mpt-7b |
| | anatomy | 0.407407 | 0 | mosaicml/mpt-7b |
| | astronomy | 0.269737 | 0 | mosaicml/mpt-7b |
| | business_ethics | 0.3 | 0 | mosaicml/mpt-7b |
| | clinical_knowledge | 0.309434 | 0 | mosaicml/mpt-7b |
| | college_biology | 0.319444 | 0 | mosaicml/mpt-7b |
| | college_chemistry | 0.2 | 0 | mosaicml/mpt-7b |
| | college_computer_science | 0.25 | 0 | mosaicml/mpt-7b |
| | college_mathematics | 0.25 | 0 | mosaicml/mpt-7b |
| | college_medicine | 0.294798 | 0 | mosaicml/mpt-7b |
| | college_physics | 0.215686 | 0 | mosaicml/mpt-7b |
| | computer_security | 0.34 | 0 | mosaicml/mpt-7b |
| | conceptual_physics | 0.238298 | 0 | mosaicml/mpt-7b |
| | econometrics | 0.263158 | 0 | mosaicml/mpt-7b |
| | electrical_engineering | 0.317241 | 0 | mosaicml/mpt-7b |
| | elementary_mathematics | 0.304233 | 0 | mosaicml/mpt-7b |
| | formal_logic | 0.293651 | 0 | mosaicml/mpt-7b |
| | global_facts | 0.34 | 0 | mosaicml/mpt-7b |
| | high_school_biology | 0.264516 | 0 | mosaicml/mpt-7b |
| | high_school_chemistry | 0.295566 | 0 | mosaicml/mpt-7b |
| | high_school_computer_science | 0.33 | 0 | mosaicml/mpt-7b |
| | high_school_european_history | 0.260606 | 0 | mosaicml/mpt-7b |
| | high_school_geography | 0.313131 | 0 | mosaicml/mpt-7b |
| | high_school_government_and_politics | 0.310881 | 0 | mosaicml/mpt-7b |
| | high_school_macroeconomics | 0.264103 | 0 | mosaicml/mpt-7b |
| | high_school_mathematics | 0.255556 | 0 | mosaicml/mpt-7b |
| | high_school_microeconomics | 0.273109 | 0 | mosaicml/mpt-7b |
| | high_school_physics | 0.264901 | 0 | mosaicml/mpt-7b |
| | high_school_psychology | 0.26055 | 0 | mosaicml/mpt-7b |
| | high_school_statistics | 0.212963 | 0 | mosaicml/mpt-7b |
| | high_school_us_history | 0.269608 | 0 | mosaicml/mpt-7b |
| | high_school_world_history | 0.291139 | 0 | mosaicml/mpt-7b |
| | human_aging | 0.273543 | 0 | mosaicml/mpt-7b |
| | human_sexuality | 0.290076 | 0 | mosaicml/mpt-7b |
| | international_law | 0.363636 | 0 | mosaicml/mpt-7b |
| | jurisprudence | 0.351852 | 0 | mosaicml/mpt-7b |
| | logical_fallacies | 0.300613 | 0 | mosaicml/mpt-7b |
| | machine_learning | 0.285714 | 0 | mosaicml/mpt-7b |
| | management | 0.300971 | 0 | mosaicml/mpt-7b |
| | marketing | 0.376068 | 0 | mosaicml/mpt-7b |
| | medical_genetics | 0.32 | 0 | mosaicml/mpt-7b |
| | miscellaneous | 0.355045 | 0 | mosaicml/mpt-7b |
| | moral_disputes | 0.320809 | 0 | mosaicml/mpt-7b |
| | moral_scenarios | 0.240223 | 0 | mosaicml/mpt-7b |
| | nutrition | 0.300654 | 0 | mosaicml/mpt-7b |
| | philosophy | 0.315113 | 0 | mosaicml/mpt-7b |
| | prehistory | 0.345679 | 0 | mosaicml/mpt-7b |
| | professional_accounting | 0.27305 | 0 | mosaicml/mpt-7b |
| | professional_law | 0.260104 | 0 | mosaicml/mpt-7b |
| | professional_medicine | 0.224265 | 0 | mosaicml/mpt-7b |
| | professional_psychology | 0.295752 | 0 | mosaicml/mpt-7b |
| | public_relations | 0.3 | 0 | mosaicml/mpt-7b |
| | security_studies | 0.2 | 0 | mosaicml/mpt-7b |
| | sociology | 0.283582 | 0 | mosaicml/mpt-7b |
| | us_foreign_policy | 0.32 | 0 | mosaicml/mpt-7b |
| | virology | 0.259036 | 0 | mosaicml/mpt-7b |
| | world_religions | 0.409357 | 0 | mosaicml/mpt-7b |
| winograd | | 0.868132 | 0 | mosaicml/mpt-7b |
| winogrande | | 0.685872 | 0 | mosaicml/mpt-7b |
| triviaqa | | 0.336781 | 0 | mosaicml/mpt-7b |