eval-scope
Meaning of the "aAcc", "fAcc", and "qAcc" metrics on the HallusionBench dataset
Hi, a quick question: I ran evalscope on the HallusionBench dataset and the report is [{'InternVL2-26B-DPO_HallusionBench_score': {'split': 'Overall', 'aAcc': '59.89473684210527', 'fAcc': '34.39306358381503', 'qAcc': '33.62637362637363'}}]. I don't understand what these metrics mean, and I couldn't find an explanation in the official GitHub repo either: https://github.com/tianyi-lab/HallusionBench
- aAcc is Accuracy per Question, i.e. the average accuracy over all questions.
- fAcc is Accuracy per Figure.
- qAcc is Accuracy per Question Pair.
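To make the difference between the three numbers concrete, here is a minimal sketch of how they could be computed. It is not the evalscope or VLMEvalKit code: the record keys `figure`, `pair`, and `correct` are hypothetical, and it assumes a figure or question pair counts as correct only when every question belonging to it is answered correctly, which is my reading of the implementation linked in this thread.

```python
from collections import defaultdict


def hallusion_metrics(records):
    """Compute aAcc, fAcc and qAcc as percentages.

    Each record is a dict with hypothetical keys:
      'figure'  - identifier of the figure the question is asked about
      'pair'    - identifier of the question pair the question belongs to
      'correct' - True if the model answered this question correctly
    """
    records = list(records)
    by_figure = defaultdict(list)
    by_pair = defaultdict(list)
    for r in records:
        by_figure[r["figure"]].append(r["correct"])
        by_pair[r["pair"]].append(r["correct"])

    def pct(flags):
        # Percentage of True values; 0.0 for an empty list.
        return 100.0 * sum(flags) / len(flags) if flags else 0.0

    return {
        # Every question counts individually.
        "aAcc": pct([r["correct"] for r in records]),
        # Assumption: a figure is correct only if all its questions are.
        "fAcc": pct([all(v) for v in by_figure.values()]),
        # Assumption: a pair is correct only if all its questions are.
        "qAcc": pct([all(v) for v in by_pair.values()]),
    }


# Made-up sample: two figures, two question pairs, one wrong answer.
sample = [
    {"figure": "f1", "pair": "q1", "correct": True},
    {"figure": "f2", "pair": "q1", "correct": False},
    {"figure": "f1", "pair": "q2", "correct": True},
    {"figure": "f2", "pair": "q2", "correct": True},
]
print(hallusion_metrics(sample))
```

Under these assumptions a single wrong answer fails its whole figure and its whole pair, which would explain why fAcc and qAcc in the report above are much lower than aAcc.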
References:
- Official documentation: https://github.com/tianyi-lab/HallusionBench#metric
- Code implementation: https://github.com/open-compass/VLMEvalKit/blob/0697b148c91da9297ed4c06e664f4fb85a63bb94/vlmeval/dataset/utils/yorn.py#L50