prompt-testing topic

LLM-RGB

105 stars, 6 forks

LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.

promptfoo

3.1k stars, 205 forks

Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models wit...