llm-as-evaluator topic

List llm-as-evaluator repositories

langtest

Stars: 496, Forks: 39

Deliver safe & effective language models

LLM-IR-Bias-Fairness-Survey

Stars: 39, Forks: 3

This is the repo for the survey of Bias and Fairness in IR with LLMs.

Timo

Stars: 18, Forks: 1

Timo: Towards Better Temporal Reasoning for Language Models (COLM 2024)

prometheus-eval

Stars: 767, Forks: 47

Evaluate your LLM's response with Prometheus and GPT4 💯

cobbler

Stars: 15, Forks: 1

Code and data for Koo et al.'s ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators"