opencompass

OpenCompass is an LLM evaluation platform that supports a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Results 261 opencompass issues

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...

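### Describe the feature When evaluating, my model type is vllm, with the following parameters: ![image](https://github.com/open-compass/opencompass/assets/97608046/851bccbb-1f7f-420c-b7ca-fc00677a12cf) However, only one GPU is used for the evaluation task: ![image](https://github.com/open-compass/opencompass/assets/97608046/4e7cedc0-2eef-4f2e-9e1b-1d4f662f6b7b) I would like the task to be split into several parts and evaluated separately on 8 GPUs. Could this feature be added? Or, if it is already possible, could you explain how? Thank you very much! For comparison, if I set the model type to HF, this happens automatically. ![image](https://github.com/open-compass/opencompass/assets/97608046/4517c3fb-0c90-4c9e-9bea-170f7e397fb9) ![image](https://github.com/open-compass/opencompass/assets/97608046/0771ecf5-16b2-4640-a4b0-c9a436b9476d) ### Will you implement it? - [ ] I would like to implement this feature myself and contribute the code to OpenCompass!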
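The behaviour requested in this issue (one evaluation job split into parts, each running on its own GPU) can be illustrated with a minimal sketch. This is not OpenCompass's own partitioning code; `split_into_shards` and `worker_env` are hypothetical helper names, and the sketch assumes each worker process is pinned to one device through the `CUDA_VISIBLE_DEVICES` environment variable.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_shards(items: Sequence[T], num_shards: int) -> List[List[T]]:
    """Split items into num_shards contiguous shards of near-equal size."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    base, extra = divmod(len(items), num_shards)
    shards: List[List[T]] = []
    start = 0
    for i in range(num_shards):
        # The first `extra` shards take one additional item each.
        size = base + (1 if i < extra else 0)
        shards.append(list(items[start:start + size]))
        start += size
    return shards


def worker_env(gpu_id: int) -> Dict[str, str]:
    """Environment variables that restrict one worker process to one GPU."""
    return {"CUDA_VISIBLE_DEVICES": str(gpu_id)}


if __name__ == "__main__":
    # Hypothetical prompts standing in for one dataset's evaluation samples.
    prompts = [f"prompt-{i}" for i in range(20)]
    for gpu_id, shard in enumerate(split_into_shards(prompts, 8)):
        print(gpu_id, worker_env(gpu_id), len(shard))
```

Each shard would then be handed to a separate process started with the matching environment, and the per-shard predictions merged before scoring.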

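### Describe the feature Ollama makes it very convenient to serve local models directly. How can this kind of API be evaluated? Could you provide an example? ### Will you implement it? - [ ] I would like to implement this feature myself and contribute the code to OpenCompass!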
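For readers unfamiliar with the API this issue refers to, the sketch below shows what a single non-streaming request to a local Ollama server looks like, using only the Python standard library. It is not an OpenCompass model wrapper. The helper names are hypothetical, the model name passed in is a placeholder, and the URL assumes Ollama's default local port.

```python
import json
import urllib.request

# Assumption: an Ollama server running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str,
                           url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, url: str = OLLAMA_URL,
             timeout: float = 60.0) -> str:
    """Send the request and return the generated text (needs a live server)."""
    request = build_generate_request(model, prompt, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

An evaluation loop would call `generate` once per benchmark prompt and pass the returned text to the dataset's scorer. Whether OpenCompass can be pointed at such an endpoint directly is the open question of the issue and is not answered here.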

### Describe the feature ![image](https://github.com/open-compass/opencompass/assets/97608046/1f85dc02-373a-4d2b-a1da-ca2ce0b153d0) ![image](https://github.com/open-compass/opencompass/assets/97608046/da63e2a7-a0b7-4ee3-9923-7c5ecf0c84b6) With batch_size set to 128, 64, and 16, the P@1 of deepseek 1.3B is 31.71, 30.49, and 29.27, respectively. Why is that? ### Will you implement it? - [ ] I would like to implement this feature and create a...
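To put the three scores in perspective, the arithmetic below converts each P@1 percentage back into a count of solved problems. The issue does not name the dataset, so the 164-problem benchmark size (the size of HumanEval) is purely an assumption here. Under that assumption the three runs differ by two problems each. The sketch does not explain why batch size changes the outcome.

```python
# Assumption: the benchmark has 164 problems (as HumanEval does);
# the issue itself does not name the dataset.
NUM_PROBLEMS = 164

# (batch_size, reported P@1 in percent), taken from the issue text.
REPORTED = [(128, 31.71), (64, 30.49), (16, 29.27)]


def solved_count(pass_at_1_percent: float,
                 num_problems: int = NUM_PROBLEMS) -> int:
    """Nearest whole number of solved problems for a given P@1 percentage."""
    return round(pass_at_1_percent / 100 * num_problems)


if __name__ == "__main__":
    for batch_size, score in REPORTED:
        solved = solved_count(score)
        recomputed = round(100 * solved / NUM_PROBLEMS, 2)
        print(f"batch_size={batch_size}: {solved} solved, P@1={recomputed}")
```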

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type I'm evaluating with the officially supported tasks/models/datasets. ### Environment {'CUDA available': True, 'CUDA_HOME': '/usr/local/cuda', 'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0', 'GPU...

…l eval Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do...

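### Describe the feature Results with choices and without choices differ by dozens of points. I hope the evaluation can be aligned with the method used for llama3: https://github.com/meta-llama/llama3/blob/main/eval_details.md ### Will you implement it? - [ ] I would like to implement this feature myself and contribute the code to OpenCompass!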

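### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type I'm evaluating with the officially supported tasks/models/datasets. ### Environment The environment is set up correctly. With max_seq_len set to 16k, inference runs normally on a single GPU; with 32k, it runs out of memory. ### Reproduces the problem - code/configuration sample None ### Reproduces the problem - command or script python...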
