opencompass
opencompass copied to clipboard
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.
### 先决条件 - [X] 我已经搜索过 [问题](https://github.com/open-compass/opencompass/issues/) 和 [讨论](https://github.com/open-compass/opencompass/discussions) 但未得到预期的帮助。 - [X] 错误在 [最新版本](https://github.com/open-compass/opencompass) 中尚未被修复。 ### 问题类型 我正在使用官方支持的任务/模型/数据集进行评估。 ### 环境 --datasets commonsenseqa_gen --hf-path llm_mode_debug/Meta-Llama-3.1-8B-Instruct/ --tokenizer-path llm_mode_debug/Meta-Llama-3.1-8B-Instruct/ --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs...
### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...
## Motivation This is a fix for the [issue](https://github.com/open-compass/opencompass/issues/1362). This issue describes a problem where specifying a `system_prompt` in the `meta_template` does not work as expected because `APITemplateParser` does not...
### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...
Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand...
我域内有机器部署了vllm 的 Qwen2-7b 的模型服务,我想调用该机器的模型服务,可以直接curl就能访问到服务,请问如何部署评测。
### Describe the feature [Feature] opencompass有计划支持CS-Bench数据集评测吗? ### Will you implement it? - [X] I would like to implement this feature and create a PR!
想请问下,我看最新的CompassRank 评测榜单分为了大语言模型官方自建榜单和大语言模型公开学术榜单,公开学术榜单里面之前一些其他的公开评测集结果好像没有了,请问在哪里可以看到 这个结果对我们还是比较重要,我们集成评测集的结果都是参考opencompass的结果
### 先决条件 - [X] 我已经搜索过 [问题](https://github.com/open-compass/opencompass/issues/) 和 [讨论](https://github.com/open-compass/opencompass/discussions) 但未得到预期的帮助。 - [X] 错误在 [最新版本](https://github.com/open-compass/opencompass) 中尚未被修复。 ### 问题类型 我修改了代码(配置不视为代码),或者我正在处理我自己的任务/模型/数据集。 ### 环境 {'CUDA available': True, 'CUDA_HOME': '/usr/local/cuda', 'GCC': 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0', 'GPU...
### 描述该功能 求配置mmlu_pro数据集的代码逻辑~ ### 是否希望自己实现该功能? - [ ] 我希望自己来实现这一功能,并向 OpenCompass 贡献代码!