llama-stack icon indicating copy to clipboard operation
llama-stack copied to clipboard

[Evals API][8/n] AnswerParsingScoringFn for MMLU

Open yanxi0830 opened this issue 1 year ago • 0 comments

--continuation of https://github.com/meta-llama/llama-stack/pull/333

TL;DR

  1. Introduce a registerable AnswerParsingScoringFn with AnswerParsingScoringContext for registering scoring functions with context
  2. Remove parameters field (context alone is sufficient for things related to scoring context)
  3. AnswerParsingScoringFn is able to run all benchmarks with multiple choice tasks (see apps PR).

Test

PROVIDER_ID=test-meta PROVIDER_CONFIG=llama_stack/providers/tests/scoring/provider_config_example.yaml pytest -s llama_stack/providers/tests/scoring/test_scoring.py --tb=short --disable-warnings

App

  • See App: https://github.com/meta-llama/llama-stack-apps/pull/105

yanxi0830 avatar Oct 31 '24 23:10 yanxi0830