prompt-testing topic
prompt-testing repositories
LLM-RGB
105 stars, 6 forks
LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.
promptfoo
3.1k stars, 205 forks
Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models wit...