evals icon indicating copy to clipboard operation
evals copied to clipboard

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Results 428 evals issues
Sort by recently updated
recently updated
newest added

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

📁 The OpenAI/evals repository has a problem with where the issue templates are stored, causing issues for contributors who are trying to create new issues. I am submitting a pull...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Addressing the elephant in the room When the concept of transformers were first unleashed, their revolutionnary accuracy results where mostly shown in the standard NLP tasks, such as POS-tagging,...

Idea for Eval

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...