[Evals API][3/n] scoring_functions / scoring meta-reference implementations

Open yanxi0830 opened this issue 1 year ago • 0 comments

Continuation of https://github.com/meta-llama/llama-stack/pull/288

TL;DR

  • scoring depends on datasetio & datasets
  • Enforce dataset_schema: the client requires each dataset's schema to follow the specified column names/types
  • Add clients for datasets / datasetio / scoring
  • Basic meta-reference impl for scoring/score, scoring/score_batch
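
To make the schema-enforcement and scoring bullets concrete, here is a minimal sketch of the idea in plain Python. It is not the code from this PR and does not use the llama-stack API: the column names, `EXPECTED_SCHEMA`, `validate_rows`, `exact_match`, and this `score` function are all hypothetical names chosen for illustration, and exact match is assumed as the scoring function only because it is the simplest one to show.

```python
from typing import Any, Dict, List

# Hypothetical schema: column name -> expected Python type.
# These column names are assumptions, not the ones defined in the PR.
DatasetSchema = Dict[str, type]

EXPECTED_SCHEMA: DatasetSchema = {
    "input_query": str,
    "generated_answer": str,
    "expected_answer": str,
}


def validate_rows(rows: List[Dict[str, Any]], schema: DatasetSchema) -> None:
    """Raise ValueError if a row lacks a column or a value has the wrong type."""
    for i, row in enumerate(rows):
        for column, expected_type in schema.items():
            if column not in row:
                raise ValueError(f"row {i}: missing column '{column}'")
            if not isinstance(row[column], expected_type):
                raise ValueError(
                    f"row {i}: column '{column}' should be {expected_type.__name__}"
                )


def exact_match(row: Dict[str, Any]) -> float:
    """Hypothetical scoring function: 1.0 on exact match, else 0.0."""
    same = row["generated_answer"].strip() == row["expected_answer"].strip()
    return 1.0 if same else 0.0


def score(
    rows: List[Dict[str, Any]], schema: DatasetSchema = EXPECTED_SCHEMA
) -> Dict[str, Any]:
    """Validate the rows against the schema, then score each row."""
    validate_rows(rows, schema)
    per_row = [exact_match(row) for row in rows]
    accuracy = sum(per_row) / len(per_row) if per_row else 0.0
    return {"score_rows": per_row, "accuracy": accuracy}
```

Validating before scoring means a dataset with a missing or mistyped column fails with a clear error up front instead of a `KeyError` partway through scoring.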

Next PR

  • full evals benchmark task with generation
  • LLM as judge scorer impl

Tests

unit tests

PROVIDER_ID=test-meta PROVIDER_CONFIG=llama_stack/providers/tests/scoring/provider_config_example.yaml pytest -s llama_stack/providers/tests/scoring/test_scoring.py --tb=short --disable-warnings
PROVIDER_ID=test-meta PROVIDER_CONFIG=llama_stack/providers/tests/datasetio/provider_config_example.yaml pytest -s llama_stack/providers/tests/datasetio/test_datasetio.py --tb=short --disable-warnings

client tests

python -m llama_stack.apis.datasetio.client $LOCALHOST 5000
python -m llama_stack.apis.datasets.client $LOCALHOST 5000
python -m llama_stack.apis.scoring.client $LOCALHOST 5000

yanxi0830, Oct 23 '24 20:10