llm-eval topic

Repositories tagged with the llm-eval topic

giskard

4.0k Stars · 261 Forks

🐢 Open-Source Evaluation & Testing for ML & LLM systems

phoenix

3.6k Stars · 267 Forks · 11 Watchers

AI Observability & Evaluation

uptrain

2.2k Stars · 188 Forks

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform ro...
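
Below is a minimal sketch of running two of those preconfigured checks, following the EvalLLM / Evals names shown in UpTrain's README; treat the exact names and arguments as assumptions that may differ between releases.

    # Sketch of running UpTrain's preconfigured checks on one (question, context, response) row.
    # Names follow the README (EvalLLM, Evals) but should be verified against the installed version.
    from uptrain import EvalLLM, Evals

    data = [{
        "question": "What is the capital of France?",
        "context": "France is a country in Western Europe. Its capital is Paris.",
        "response": "The capital of France is Paris.",
    }]

    eval_llm = EvalLLM(openai_api_key="sk-...")  # an LLM is used as the grader
    results = eval_llm.evaluate(
        data=data,
        checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY],
    )
    print(results)  # per-row scores and explanations for each check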

promptfoo

6.9k Stars · 552 Forks · 21 Watchers

Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command...
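
promptfoo itself is driven by declarative config files and a CLI, documented in the repo. Purely to illustrate the idea of declaring prompts, providers, and assertions and then comparing outputs across models, here is a toy Python sketch with stand-in provider functions; none of the names below are promptfoo's.

    # Illustrative only: a toy "declarative config" for comparing providers on the
    # same prompts with simple pass/fail assertions. NOT promptfoo's config format.
    def fake_gpt(prompt: str) -> str:      # stand-ins for real provider calls
        return "Paris is the capital of France."

    def fake_claude(prompt: str) -> str:
        return "The capital of France is Paris."

    SPEC = {
        "prompts": ["What is the capital of {country}?"],
        "vars": [{"country": "France"}],
        "providers": {"gpt": fake_gpt, "claude": fake_claude},
        "asserts": [lambda out: "Paris" in out],  # checks applied to every output
    }

    for template in SPEC["prompts"]:
        for vars_ in SPEC["vars"]:
            prompt = template.format(**vars_)
            for name, provider in SPEC["providers"].items():
                out = provider(prompt)
                ok = all(check(out) for check in SPEC["asserts"])
                print(f"{name}: {'PASS' if ok else 'FAIL'} - {out}")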

athina-evals

210 Stars · 12 Forks

Python SDK for running evaluations on LLM-generated responses

just-eval

74 Stars · 6 Forks

A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
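
A rough sketch of the multi-aspect, GPT-as-judge pattern this kind of tool implements, using the OpenAI Python client; the aspects and judging prompt below are illustrative, not just-eval's actual rubric.

    # Minimal GPT-as-judge sketch: score one answer on several aspects and return
    # interpretable per-aspect ratings as JSON. Aspects and prompt are illustrative.
    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    ASPECTS = ["helpfulness", "factuality", "clarity"]

    def judge(question: str, answer: str) -> dict:
        rubric = (
            f"Rate the answer to the question on each aspect {ASPECTS} "
            "from 1 to 5 and give a one-sentence reason per aspect. "
            "Reply with JSON only, e.g. "
            '{"helpfulness": {"score": 4, "reason": "..."}, ...}'
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": rubric},
                {"role": "user", "content": f"Question: {question}\nAnswer: {answer}"},
            ],
            response_format={"type": "json_object"},
        )
        return json.loads(resp.choices[0].message.content)

    print(judge("What causes tides?", "Mainly the Moon's gravity, plus the Sun's."))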

rulm-sbs2

57 Stars · 2 Forks

A benchmark comparing Russian ChatGPT analogues: Saiga, YandexGPT, Gigachat

parea-sdk-py

74 Stars · 6 Forks

Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

ragrank

25 Stars · 11 Forks

🎯 A free LLM evaluation toolkit for assessing factual accuracy, context understanding, tone, and more, so you can see how well your LLM applications perform.

prompto

19 Stars · 1 Fork

An open source library for asynchronous querying of LLM endpoints
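
Not prompto's own API, but a minimal sketch of the asynchronous fan-out pattern it implements: several requests sent concurrently to an OpenAI-compatible chat endpoint with asyncio and httpx. The URL, headers, and payload shape are assumptions.

    # Sketch of async fan-out to an LLM endpoint with asyncio + httpx.
    # The URL and payload assume an OpenAI-compatible chat completions API.
    import asyncio
    import httpx

    API_URL = "https://api.openai.com/v1/chat/completions"  # assumed endpoint
    HEADERS = {"Authorization": "Bearer sk-..."}

    async def query(client: httpx.AsyncClient, prompt: str) -> str:
        payload = {"model": "gpt-4o-mini",
                   "messages": [{"role": "user", "content": prompt}]}
        r = await client.post(API_URL, json=payload, headers=HEADERS, timeout=60)
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"]

    async def main(prompts: list[str]) -> list[str]:
        async with httpx.AsyncClient() as client:
            # gather() runs all requests concurrently instead of one at a time
            return await asyncio.gather(*(query(client, p) for p in prompts))

    if __name__ == "__main__":
        answers = asyncio.run(main(["Define recall.", "Define precision."]))
        for a in answers:
            print(a)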