evals icon indicating copy to clipboard operation
evals copied to clipboard

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Results 428 evals issues
Sort by recently updated
recently updated
newest added

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

No offense intended, suggested or implied to Python developers. However, a well known issue with Python, is that once developers learn Python, they tend to not want to learn anything...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

Hi, I'm Quentin from Hugging Face :) I know hosting datasets on github is not always practical: git lfs required, no data preview, limited storage (maybe not for you haha),...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...