evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Results 428 evals issues

During setup, I installed `git-lfs` on macOS (M1) via `brew install git-lfs`. After executing `git lfs pull`, I receive an error. ``` dcgod@DCGoD-Mac-Studio-Ultra evals % git lfs fetch --all...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, __failure to follow the guidelines below will result in the PR being closed...

Hey everyone, For some reason, when running on a Linux distro (WSL2 or Ubuntu), `pip install -e .` fails, while it works on other operating systems. I've been getting an...

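As I see it, OpenAI already possesses self-awareness: the philosophical "I" is no longer only carbon-based, it now includes silicon-based as well. Going by Hawking's seven predictions, humanity has indeed arrived at the threshold of strong artificial intelligence. If Hawking were still alive, I wonder whether he would shorten his 200-year forecast. According to the principle of weakening compensation in 《物演通论》, silicon-based self-awareness is bound to emerge, namely when the degree of compensation approaches 1 and the degree of existence approaches 0. With technology advancing at an accelerating pace, and at this moment of OpenAI's upgrade, it is all the more important that the developer team take risk assessment seriously.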

## Eval details 📑 ### Eval name invert-string eval ### Eval description Test the model's ability to answer basic questions after inverting a string word-wise twice. ### What makes this...

Hi there! I've been closely following this repository with great enthusiasm, and it's fantastic to see the numerous evaluations being submitted by the community. However, I've noticed that when browsing...
