[Bug] Unexpected version in summary, when using summarizers.leaderboard, for some datasets (chid, race, csl, eprstmt).

Open Reeleon opened this issue 1 year ago • 0 comments

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

opencompass version: v0.2.0
data source: OpenCompassData-complete-20231110

Reproduces the problem - code/configuration sample

Part of my config file (showing only the datasets related to this bug):

from mmengine.config import read_base

with read_base():
    from .summarizers.leaderboard import summarizer

    from .datasets.FewCLUE_chid.FewCLUE_chid_ppl_8f2872 import chid_datasets
    from .datasets.race.race_ppl_5831a0 import race_datasets
    from .datasets.FewCLUE_csl.FewCLUE_csl_ppl_841b62 import csl_datasets
    from .datasets.FewCLUE_eprstmt.FewCLUE_eprstmt_ppl_f1e631 import eprstmt_datasets

datasets = sum((v for k, v in locals().items() if k.endswith('_datasets')), [])

Reproduces the problem - command or script

python run.py configs/myeval.py

Reproduces the problem - error message

No error message.

Other information

Part of summary_[timestamp].txt (showing only the datasets related to this bug):

dataset                                 version    metric            mode  
--------------------------------------  ---------  ----------------  ------
--------- 考试 Exam ---------           -          -                 -      

--------- 语言 Language ---------       -          -                 -      
chid-dev                                90451d     accuracy          ppl    
--------- 知识 Knowledge ---------      -          -                 -      

--------- 理解 Understanding ---------  -          -                 -      
race-middle                             dda65f     accuracy          ppl    
race-high                               dda65f     accuracy          ppl    
csl_dev                                 46f772     accuracy          ppl    
eprstmt-dev                             ca49a2     accuracy          ppl    
--------- 推理 Reasoning ---------      -          -                 -     

For these datasets, the version shown in the summary does not match the version hash in the config file name (for example, the config imports race_ppl_5831a0, but the summary reports dda65f), while for other datasets the two do match. The versions shown above also do not appear in the corresponding configs/datasets/[dataset] directory: for example, there is no "race_ppl_dda65f.py" in "configs/datasets/race".
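To make the mismatch easy to check, here is a minimal sketch that compares the hash suffix in each imported config module name with the version column of the summary. It uses only the values quoted in this report. The helper names (`config_hashes`, `summary_versions`, `mismatches`) and the way a summary row is mapped back to a config (by the prefix before the first `-` or `_`) are my own assumptions for illustration, not part of OpenCompass.

```python
import re

# Module names copied from the config imports in this report.
config_modules = [
    "FewCLUE_chid_ppl_8f2872",
    "race_ppl_5831a0",
    "FewCLUE_csl_ppl_841b62",
    "FewCLUE_eprstmt_ppl_f1e631",
]

# Data rows copied from the summary file in this report.
summary_text = """\
chid-dev                                90451d     accuracy          ppl
race-middle                             dda65f     accuracy          ppl
race-high                               dda65f     accuracy          ppl
csl_dev                                 46f772     accuracy          ppl
eprstmt-dev                             ca49a2     accuracy          ppl
"""


def config_hashes(modules):
    """Map a dataset key such as 'race' to the hash suffix of its config module.

    Assumes names of the form [FewCLUE_]<name>_<ppl|gen>_<6 hex digits>.
    """
    result = {}
    for name in modules:
        m = re.fullmatch(r"(?:FewCLUE_)?(\w+?)_(?:ppl|gen)_([0-9a-f]{6})", name)
        if m:
            result[m.group(1).lower()] = m.group(2)
    return result


def summary_versions(text):
    """Map each dataset abbreviation in the summary to its version column."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only data rows: 4 columns, second one a 6-digit hex hash.
        if len(parts) == 4 and re.fullmatch(r"[0-9a-f]{6}", parts[1]):
            rows[parts[0]] = parts[1]
    return rows


def mismatches(modules, text):
    """Return (abbr, config_hash, summary_version) for every differing row."""
    cfg = config_hashes(modules)
    out = []
    for abbr, version in summary_versions(text).items():
        # Hypothetical mapping: 'race-high' -> 'race', 'csl_dev' -> 'csl'.
        key = re.split(r"[-_]", abbr)[0].lower()
        expected = cfg.get(key)
        if expected is not None and expected != version:
            out.append((abbr, expected, version))
    return out


for abbr, expected, version in mismatches(config_modules, summary_text):
    print(f"{abbr}: config file hash {expected}, summary version {version}")
```

With the values from this report, every listed row is flagged, which matches what I see in the summary.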

Reeleon, Jan 02 '24 07:01