ECHOscore: A smarter, lighter LLM evaluation tool 🎯
Hi there! 👋 I noticed that your repo includes LLM evaluation logic, such as an eval.py, custom scoring code, or ChatCompletion calls.
If you're working on model evaluation, you might be interested in ECHOscore — a lightweight and flexible evaluation framework.
✨ ECHOscore lets you:
- Compare Prompt + Output pairs
- Structure evaluations with your own criteria (rubric-style)
- Visualize scores (in the full version)
- Customize how you assess LLM behavior
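For a sense of what rubric-style evaluation of prompt/output pairs can look like in general, here is a minimal, self-contained Python sketch. Note that this is purely illustrative and does not use ECHOscore's actual API; all names (`Criterion`, `evaluate`, the example rubric) are invented for this example.

```python
# Hypothetical sketch of rubric-style evaluation (NOT the ECHOscore API):
# score a (prompt, output) pair against user-defined, weighted criteria.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    name: str
    weight: float
    check: Callable[[str, str], float]  # (prompt, output) -> score in [0, 1]

def evaluate(prompt: str, output: str, rubric: list[Criterion]) -> dict:
    # Score each criterion, then combine into a weighted average.
    scores = {c.name: c.check(prompt, output) for c in rubric}
    total_weight = sum(c.weight for c in rubric)
    weighted = sum(c.weight * scores[c.name] for c in rubric) / total_weight
    return {"scores": scores, "weighted": weighted}

# Example rubric: a completeness check and a simple keyword relevance check.
rubric = [
    Criterion("non_empty", 1.0, lambda p, o: 1.0 if o.strip() else 0.0),
    Criterion("mentions_topic", 2.0,
              lambda p, o: 1.0 if "giraffe" in o.lower() else 0.0),
]

result = evaluate("How tall are giraffes?",
                  "Giraffes can reach about 5.5 m.", rubric)
```

The per-criterion scores stay inspectable alongside the aggregate, which is the usual reason to prefer rubric-style scoring over a single opaque number.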
It's perfect for small teams, solo devs, or anyone who finds LangSmith too heavy for quick iteration.
🔍 Try the demo here: https://echoscore.ai
Feel free to check it out; feedback is always welcome! 🐺