evaluation-metrics topic

List evaluation-metrics repositories

ctc-gen-eval

94
Stars
9
Forks

EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation

StreamingRec

43
Stars
15
Forks

A news recommendation evaluation framework

agentops

5.2k
Stars
505
Forks
5.2k
Watchers

Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including CrewAI, Agno, OpenAI Agents SDK, Langchain, Autogen, AG2, and Ca...

TaPR

16
Stars
6
Forks

Time-series Aware Precision and Recall for Evaluating Anomaly Detection Methods

CERberus

23
Stars
0
Forks

CERberus -- guardian against character errors 🐶🐶🐶

chatgpt_as_nlg_evaluator

41
Stars
1
Forks

Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study

ErrorAnalysis_Prompt

84
Stars
3
Forks

🎁 [ChatGPT4MTevaluation] ErrorAnalysis Prompt for MT Evaluation in ChatGPT

continuous-eval

436
Stars
28
Forks

Data-Driven Evaluation for LLM-Powered Applications

summarization-eval

99
Stars
7
Forks

📝 Reference-Free automatic summarization evaluation with potential hallucination detection

summary-workbench

31
Stars
6
Forks

Framework for unified summarisation and evaluation of English documents using state-of-the-art models and measures.