Luo Zhen

Results: 5 issues by Luo Zhen

### Prerequisites

- [x] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the help I expected.
- [x] The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).

### Issue type

I am evaluating with officially supported tasks/models/datasets.

### Environment

{'CUDA available': False, 'GCC': 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0', 'MMEngine': '0.10.7', 'MUSA...

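## Problem description

When evaluating **ChartQA**, I found that the evaluation framework misjudges some answers because of differences in numeric format.

## Example

**Question:**

> What's the percentage of U.S adults who refused?

**Chart:** ![chart](https://github.com/user-attachments/assets/a3058e16-ce10-4ef4-9c59-149719107783)

**Model answer:** `2%`

**Ground truth:** `2`

**Evaluation result:** `False`

## Explanation

In this example, the model's output "2%" and the ground truth...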
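The mismatch in the ChartQA report above (`2%` scored against `2`) can be illustrated with a small comparison helper. This is a minimal sketch, not the OpenCompass scoring code: the function names `to_number` and `relaxed_match` and the 5% relative tolerance are assumptions made for illustration, as is the choice to read `2%` as the number 2 rather than 0.02.

```python
def to_number(text):
    """Parse a numeric answer, ignoring a trailing '%' and thousands separators.

    Returns None when the text is not a plain number.
    Assumption: "2%" is read as 2, not 0.02.
    """
    cleaned = text.strip().rstrip("%").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def relaxed_match(prediction, target, tolerance=0.05):
    """Compare a prediction with a target answer.

    Numeric answers match when their relative error is within `tolerance`
    (hypothetical default of 5%); all other answers fall back to a
    case-insensitive string comparison.
    """
    pred_num = to_number(prediction)
    target_num = to_number(target)
    if pred_num is not None and target_num is not None:
        if target_num == 0:
            return pred_num == 0
        return abs(pred_num - target_num) / abs(target_num) <= tolerance
    return prediction.strip().lower() == target.strip().lower()


# The case from the report: "2%" against "2".
print(relaxed_match("2%", "2"))
```

Stripping the percent sign before parsing is what lets the two answers compare as equal; whether that normalization is appropriate for every ChartQA question is a separate judgment call.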

I followed the steps in the provided link to deploy Qwen3-8B locally as a judge model. However, when evaluating the MMBench_DEV_EN_V11 dataset, an error was thrown at line 263 in...

Hello, I would like to report an issue with the *chembench* dataset: there are two duplicate questions with different UUIDs and conflicting ground truth answers. The two questions are: *...
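A check for this kind of duplicate can be sketched as follows. The record layout is hypothetical: the field names `uuid`, `question`, and `answer` are assumptions for illustration and may not match the actual *chembench* schema, and the sample records are made up.

```python
from collections import defaultdict


def find_conflicting_duplicates(records):
    """Return lists of UUIDs whose questions are identical but whose answers differ.

    Questions are compared after collapsing whitespace and lowercasing.
    Field names ("uuid", "question", "answer") are hypothetical.
    """
    groups = defaultdict(list)
    for record in records:
        key = " ".join(record["question"].split()).lower()
        groups[key].append(record)

    conflicts = []
    for group in groups.values():
        answers = {record["answer"] for record in group}
        if len(group) > 1 and len(answers) > 1:
            conflicts.append([record["uuid"] for record in group])
    return conflicts


# Made-up sample records for demonstration only.
sample = [
    {"uuid": "a1", "question": "Which element is a noble gas?", "answer": "A"},
    {"uuid": "b2", "question": "Which element is a  noble gas?", "answer": "C"},
    {"uuid": "c3", "question": "What is the pH of pure water?", "answer": "B"},
]
print(find_conflicting_duplicates(sample))
```

Grouping on normalized question text keeps the check independent of the UUIDs, which is the point here since the duplicates carry different identifiers.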