llm-eval topic

Repositories tagged with the llm-eval topic:

giskard

3.3k stars · 215 forks

🐢 Open-Source Evaluation & Testing for LLMs and ML models

phoenix

2.8k stars · 199 forks · 11 watchers

AI Observability & Evaluation

uptrain

2.0k stars · 171 forks

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform ro...

promptfoo

3.1k stars · 205 forks

Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models wit...

athina-evals

140 stars · 11 forks

Python SDK for running evaluations on LLM generated responses

just-eval

63 stars · 4 forks

A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.

rulm-sbs2

54 stars · 2 forks

A benchmark comparing Russian ChatGPT analogues: Saiga, YandexGPT, Gigachat

parea-sdk-py

41 stars · 4 forks

Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

ragrank

20 stars · 8 forks

🎯 A free LLM evaluation toolkit for assessing factual accuracy, context understanding, tone, and more, so you can see how well your LLM applications perform.