llm-eval topic
giskard
🐢 Open-Source Evaluation & Testing for ML & LLM systems
phoenix
AI Observability & Evaluation (a minimal launch sketch appears after this list)
uptrain
UpTrain is an open-source, unified platform for evaluating and improving Generative AI applications. It provides grades for 20+ preconfigured checks covering language, code, and embedding use cases, and performs root-cause analysis on failing cases (a usage sketch appears after this list).
promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command-line and CI/CD integration.
athina-evals
Python SDK for running evaluations on LLM-generated responses
just-eval
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
rulm-sbs2
A benchmark comparing Russian ChatGPT analogues: Saiga, YandexGPT, Gigachat
parea-sdk-py
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
ragrank
🎯 A free LLM evaluation toolkit for assessing factual accuracy, context understanding, tone, and more, so you can gauge the quality of your LLM applications.
prompto
An open source library for asynchronous querying of LLM endpoints
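Phoenix (listed above) runs as a local observability app. A minimal sketch of starting it, assuming the `arize-phoenix` package is installed and using its documented `launch_app()` entry point:

```python
# Minimal sketch: start the Phoenix observability UI locally.
# Assumes `pip install arize-phoenix`; traces and evals can then be sent to the running session.
import phoenix as px

session = px.launch_app()  # launches the local Phoenix app
print(session.url)         # open this URL to inspect traces and evaluations
```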
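And a hedged sketch of UpTrain's preconfigured checks, assuming the `uptrain` package and an OpenAI API key; the check names come from its `Evals` enum, and the sample data is purely illustrative:

```python
# Hedged sketch: grade a response with a few of UpTrain's preconfigured checks.
from uptrain import EvalLLM, Evals

eval_llm = EvalLLM(openai_api_key="sk-...")  # placeholder key

data = [{
    "question": "What is the capital of France?",
    "context": "France's capital and largest city is Paris.",
    "response": "The capital of France is Paris.",
}]

results = eval_llm.evaluate(
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_RELEVANCE],
)
print(results)  # one score per check for each row
```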