opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Results 261 opencompass issues

### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help.
- [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).

### Type...

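### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help.
- [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).

### Type
I'm evaluating with the officially supported tasks/models/datasets.

### Environment
python 3.11

### Reproduces the problem - code/configuration sample
.

### Reproduces the problem - command or script...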

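## Motivation
Add the SciCode dataset.

## Modification
1. Add [opencompass/datasets/SciCode.py](https://github.com/open-compass/opencompass/compare/main...HariSeldon0:opencompass:add_SciCode?expand=1#diff-2c86f8ce5eee7e5ba74aab1c229c228b061f60bf9d3d8e057f2f9108f0e85e97), which implements the `SciCodeDataset` and `SciCodeEvaluator` classes.
2. Add the SciCode dataset configuration files; the readme gives reference results for qwen2 and llama3.
3. Modify [openicl/icl_inferencer/icl_chat_inferencer.py](https://github.com/open-compass/opencompass/compare/main...HariSeldon0:opencompass:add_SciCode?expand=1#diff-d79f68839eec4ede64ce702fd5cd1e09e6b2e724c56a3d359779ba488b090ba1) to fix a piece of unsound logic.

**Rationale**: The function `inference` calls `infer_every` once per test case, and the parameter `index` means the `index`-th test case. One test case is one multi-turn conversation, and `infer_every` uses a loop to process each turn, so `index` should not be incremented inside `infer_every`. Put simply, `index` is the outer loop's index and must not be incremented in the inner loop. In addition, given how `output_handler.save_multiround_results` works, incrementing `index` puts the saved dialogue results in the wrong place: for example, the result of turn 2 of test case 1 is saved under test case 2.

## Dataset paths
- The dataset without background is temporarily hosted at: https://drive.google.com/file/d/1CSqDzKZ7tKYqr2SpUl1z3HEBlN9-qP_i/view?usp=sharing
- The dataset with background is temporarily hosted at: https://drive.google.com/file/d/1_BYBlPZCvuKccIiqEA_T84-t1ZVRGXpX/view?usp=sharing

## Checklist
**Before PR**:
- [ ] Pre-commit or other linting tools are used...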
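The indexing problem described in the rationale can be reproduced with a small toy model. This is a minimal sketch, not the OpenCompass code: `ToyOutputHandler`, its simplified `save_multiround_results(turn_output, index)` signature, and the two `infer_every_*` variants are hypothetical stand-ins that mirror only the loop structure the PR describes.

```python
from collections import defaultdict


class ToyOutputHandler:
    """Hypothetical stand-in for the output handler (not OpenCompass code)."""

    def __init__(self):
        # Maps a test-case index to the list of per-turn outputs saved for it.
        self.results = defaultdict(list)

    def save_multiround_results(self, turn_output, index):
        # Assumed behavior: append one turn's output under the given index.
        self.results[index].append(turn_output)


def infer_every_buggy(turns, index, handler):
    """Inner loop over turns that wrongly increments the outer index."""
    for turn in turns:
        handler.save_multiround_results(f"reply to {turn}", index)
        index += 1  # bug: shifts later turns into the next test case's slot


def infer_every_fixed(turns, index, handler):
    """Inner loop over turns that leaves the outer index untouched."""
    for turn in turns:
        handler.save_multiround_results(f"reply to {turn}", index)


def inference(test_cases, infer_every):
    """Outer loop: one call to infer_every per test case."""
    handler = ToyOutputHandler()
    for index, turns in enumerate(test_cases):
        infer_every(turns, index, handler)
    return dict(handler.results)


# Two test cases: the first has two turns, the second has one.
test_cases = [["case0-turn0", "case0-turn1"], ["case1-turn0"]]
buggy = inference(test_cases, infer_every_buggy)
fixed = inference(test_cases, infer_every_fixed)
print("buggy:", buggy)
print("fixed:", fixed)
```

With the buggy variant, the second turn of the first test case ends up stored alongside the second test case's results, which is the misplacement the PR reports. With the fixed variant, each test case keeps all of its own turns.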

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand...

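### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help.
- [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).

### Type
I'm evaluating with the officially supported tasks/models/datasets.

### Environment
The accuracy may be wrong: in the JSON files most of the answers look correct, but the reported accuracy is only 36%. [ARC-e-rate.json](https://github.com/user-attachments/files/16517610/ARC-e-rate.json) [ARC-e.json](https://github.com/user-attachments/files/16517611/ARC-e.json)

### Reproduces the problem - code/configuration sample
`python run.py configs/eval_demo.py`

###...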

1. Update the option postprocessor to extract the option from text like "The correct answer option is `\boxed{ABCD..}`" (Qwen2-Math produces output in this form).
2. Fix errors in the mathbench language summarizer.
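The kind of extraction described in the first item can be illustrated with a short regular-expression sketch. This is not the postprocessor from the PR: the function name `extract_boxed_option`, the `options` parameter, and the choice to take the last boxed letter in the text are all assumptions made for this example.

```python
import re

# Matches a single capital letter wrapped in \boxed{...}, allowing spaces.
BOXED_OPTION = re.compile(r"\\boxed\{\s*([A-Z])\s*\}")


def extract_boxed_option(text, options="ABCD"):
    """Return the last boxed option letter found in `text`, or None.

    Hypothetical helper: only letters listed in `options` are accepted.
    """
    matches = BOXED_OPTION.findall(text)
    # Assumption: the final boxed letter is the model's answer.
    for letter in reversed(matches):
        if letter in options:
            return letter
    return None


print(extract_boxed_option(r"The correct answer option is \boxed{B}"))
```

Taking the last match is one possible design choice, on the assumption that a model may mention other boxed letters while reasoning before it states its final answer. A real postprocessor might also need to handle multi-letter answers or other answer formats.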
