llm-evaluation-toolkit topic

athina-evals

Stars: 140 | Forks: 11

Python SDK for running evaluations on LLM-generated responses

just-eval

Stars: 63 | Forks: 4

A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.

parea-sdk-py

Stars: 41 | Forks: 4

Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)