opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
### Describe the feature Hi! I'm Sergey from the Integrations team over at [AI/ML API](https://aimlapi.com/), a startup with 150K+ users, providing over 300 AI models in one place. Your project...
Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and help it get feedback more easily. If you do not understand...
### Prerequisite - [x] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [x] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type I'm evaluating with the officially supported tasks/models/datasets. ### Environment {'CUDA available': True, 'CUDA_HOME': '/mnt/petrelfs/share/cuda-12.1', 'GCC': 'gcc (GCC) 9.4.0', 'GPU 0':...
## Motivation This PR adds support for the GrandPhysics dataset to OpenCompass. GrandPhysics is an evaluation dataset (not yet publicly available) built from A Grand Dictionary of Physics Problems and...
### Describe the feature Thanks for your contribution to this all-in-one evaluation kit. I noticed that Qwen3 used an MMLU variant, [MMLU-redux](https://github.com/aryopg/mmlu-redux), for its evaluation. But I can't find any...
### Prerequisite - [x] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [x] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...
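### Describe the feature When I ran the command below to download the dataset source files, I found that the source files for some evaluation sets could not be downloaded automatically. In that case, do I need to search for the corresponding source files myself (I have looked, but some of them still cannot be found)? If so, how can I confirm that the source files I find are the correct test files for that evaluation set? `python run.py --models model_name --datasets dataset_name --debug --dry-run` ### Will you implement it? - [x] I would like to implement this feature and create a PR!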