opencompass issues

Support devops-eval

1

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand...

PanameraXXX

[Feature] ModuleNotFoundError: No module named 'opencompass.datasets.lawbench.evaluation_functions'

2

### Describe the feature 我执行命令python run.py --datasets ceval_ppl mmlu_ppl --hf-path /T106/LLM_model/llama-7b --model-kwargs device_map='auto' --tokenizer-kwargs padding_side='left' truncation='left' use_fast=False --max-out-len 100 --max-seq-len 2048 --batch-size 8 --no-batch-padding --num-gpus 1 ................................... 98%|████████████████████████████████████████████████████████████████████████████████████████████▎ | 107/109...

plutoda588

[Feature] Is there any data management tool for mmlab?

### Describe the feature Hi mmlab members, Is there any open source NLP data management tool developed by mmlab? Thanks ### Will you implement it? - [ ] I would...

starlitsky2010

[Feature] Incorrect config file link on leaderboard.

1

### Describe the feature On the [LLM leaderoard](https://opencompass.org.cn/leaderboard-llm), some scores are linked to incorrect config file. Just click the button for "View the configuration file for this score" and confirm...

Reeleon

[Bug] Unexpected version in summary, when using summarizers.leaderboard, for some datasets (chid, race, csl, eprstmt).

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [ ] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ###...

Reeleon

[Bug] When using "batch adding=True", there is a significant decrease in the performance of the generated task

### 先决条件 - [X] 我已经搜索过 [问题](https://github.com/open-compass/opencompass/issues/) 和 [讨论](https://github.com/open-compass/opencompass/discussions) 但未得到预期的帮助。 - [X] 错误在 [最新版本](https://github.com/open-compass/opencompass) 中尚未被修复。 ### 问题类型我正在使用官方支持的任务/模型/数据集进行评估。 ### 环境 {'CUDA available': True, 'CUDA_HOME': '/home/yangzhao/cuda-11.8', 'GCC': 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0', 'GPU...

zy125413

FileNotFoundError: Couldn't find a module script at /data/opencompass/bleurt/bleurt.py. Module 'bleurt' doesn't exist on the Hugging Face Hub either.[Bug]

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...

Modas-Li

[Bug] run.py configs/eval_subjective_score.py report "AttributeError: can't set attribute 'pad_token_id'"

1

### Prerequisite - [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help. - [X] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass). ### Type...

e06084

[Feature] Evaluation detail results of humaneval and evalplus

### Describe the feature - https://github.com/open-compass/opencompass/pull/720#issuecomment-1863958692 @jingmingzhuo ### Will you implement it? - [ ] I would like to implement this feature and create a PR!

tonysy

[Feature] Some suggestion about design

### 描述该功能 https://github.com/open-compass/opencompass/blob/637628a70fc708057cfd6dfe8717ca9035553bc8/opencompass/tasks/openicl_eval.py#L127-L149 这一段的逻辑是不是可以放在 https://github.com/open-compass/opencompass/blob/97c2068bd9b21ac2b30177db6531554f4695bc51/opencompass/models/base.py#L132 里？ `_extract_role_pred` 看上去将Chat模型的回答中提取出 begin_token 与 end_token 中间的部分，放在**模型中似乎更合理**。 'pred_role' 看上去只是指示使用 meta_tmplate中的哪一个角色的begin_token\end_token，本质上还是使用最后一段话，我认为不如直接约定为 'BOT' 或者 ‘assistant’的begin_token\end_token。考虑到的点： 1. openicl_eval 里的这段逻辑有些奇怪，放在model里面合理很多。 2. predction中不会存在特殊的toekn 3. chatinferencer 使用时不用在infer中间去除这些特殊的token ### 是否希望自己实现该功能？ -...

Ezra-Yu

opencompass
opencompass copied to clipboard

Metadata

Support devops-eval

[Feature] ModuleNotFoundError: No module named 'opencompass.datasets.lawbench.evaluation_functions'

[Feature] Is there any data management tool for mmlab?

[Feature] Incorrect config file link on leaderboard.

[Bug] Unexpected version in summary, when using summarizers.leaderboard, for some datasets (chid, race, csl, eprstmt).

[Bug] When using "batch adding=True", there is a significant decrease in the performance of the generated task

FileNotFoundError: Couldn't find a module script at /data/opencompass/bleurt/bleurt.py. Module 'bleurt' doesn't exist on the Hugging Face Hub either.[Bug]

[Bug] run.py configs/eval_subjective_score.py report "AttributeError: can't set attribute 'pad_token_id'"

[Feature] Evaluation detail results of humaneval and evalplus

[Feature] Some suggestion about design

← Metadata

Owner

Metadata

opencompass opencompass copied to clipboard

Metadata

← Metadata

Owner

Metadata

opencompass
opencompass copied to clipboard