opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Results 261 opencompass issues

Thanks for your contribution and we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand...

### Describe the feature I ran the command `python run.py --datasets ceval_ppl mmlu_ppl --hf-path /T106/LLM_model/llama-7b --model-kwargs device_map='auto' --tokenizer-kwargs padding_side='left' truncation='left' use_fast=False --max-out-len 100 --max-seq-len 2048 --batch-size 8 --no-batch-padding --num-gpus 1` ................................... 98%|████████████████████████████████████████████████████████████████████████████████████████████▎ | 107/109...

### Describe the feature Hi mmlab members, Is there any open source NLP data management tool developed by mmlab? Thanks ### Will you implement it? - [ ] I would...

### Describe the feature On the [LLM leaderboard](https://opencompass.org.cn/leaderboard-llm), some scores are linked to the wrong config file. Just click the button for "View the configuration file for this score" and confirm...

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [ ] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ###...

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type I'm evaluating with the officially supported tasks/models/datasets. ### Environment {'CUDA available': True, 'CUDA_HOME': '/home/yangzhao/cuda-11.8', 'GCC': 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0', 'GPU...

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...

### Describe the feature - https://github.com/open-compass/opencompass/pull/720#issuecomment-1863958692 @jingmingzhuo ### Will you implement it? - [ ] I would like to implement this feature and create a PR!

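### Describe the feature Could the logic in https://github.com/open-compass/opencompass/blob/637628a70fc708057cfd6dfe8717ca9035553bc8/opencompass/tasks/openicl_eval.py#L127-L149 be moved into https://github.com/open-compass/opencompass/blob/97c2068bd9b21ac2b30177db6531554f4695bc51/opencompass/models/base.py#L132? `_extract_role_pred` appears to extract the part of a chat model's response that lies between begin_token and end_token, so **placing it in the model seems more reasonable**. 'pred_role' appears only to indicate which role's begin_token/end_token in the meta_template should be used; in essence it still uses the last segment, so I think it would be better to simply fix it by convention to the begin_token/end_token of 'BOT' or 'assistant'. Points considered: 1. This logic is somewhat odd in openicl_eval; it makes much more sense in the model. 2. Predictions would not contain special tokens. 3. When using chatinferencer, there would be no need to strip these special tokens in the middle of infer. ### Will you implement it? -...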
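
To make the behaviour under discussion concrete, here is a minimal sketch of extracting the text between a begin marker and an end marker. It is not the OpenCompass implementation of `_extract_role_pred`: the function name `extract_role_pred`, its signature, and the marker strings `<|bot|>` and `<|eot|>` are hypothetical and chosen only for illustration. The sketch assumes a missing marker should fall back to the start or end of the string, and it matches the first occurrence of each marker.

```python
from typing import Optional


def extract_role_pred(text: str,
                      begin_token: Optional[str],
                      end_token: Optional[str]) -> str:
    """Return the part of `text` between `begin_token` and `end_token`.

    Hypothetical helper for illustration only. If a marker is None,
    empty, or not found, the corresponding boundary defaults to the
    start or end of `text`.
    """
    start, end = 0, len(text)
    if begin_token:
        idx = text.find(begin_token)
        if idx != -1:
            # Skip past the begin marker itself.
            start = idx + len(begin_token)
    if end_token:
        # Only look for the end marker after the chosen start position.
        idx = text.find(end_token, start)
        if idx != -1:
            end = idx
    return text[start:end]


# Made-up marker strings, not taken from any real meta template.
raw = "<|bot|>The answer is B.<|eot|>"
print(extract_role_pred(raw, "<|bot|>", "<|eot|>"))
```

If a helper like this lived on the model class, as the issue proposes, the evaluation task would receive predictions with the role markers already stripped.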