
[Bug] local datasets loading error

Open Lizlone opened this issue 3 months ago • 0 comments

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': False, 'GCC': 'n/a', 'MMEngine': '0.10.7', 'MSVC': '用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.43.34810 版', 'MUSA available': False, 'OpenCV': '4.11.0', 'PyTorch': '2.8.0+cpu', 'PyTorch compiling details': 'PyTorch built with:\n' ' - C++ Version: 201703\n' ' - MSVC 193833145\n' ' - Intel(R) oneAPI Math Kernel Library Version ' '2025.2-Product Build 20250620 for Intel(R) 64 ' 'architecture applications\n' ' - Intel(R) MKL-DNN v3.7.1 (Git Hash ' '8d263e693366ef8db40acc569cc7d8edf644556d)\n' ' - OpenMP 2019\n' ' - LAPACK is enabled (usually provided by ' 'MKL)\n' ' - CPU capability usage: AVX512\n' ' - Build settings: BLAS_INFO=mkl, ' 'BUILD_TYPE=Release, ' 'COMMIT_SHA=a1cb3cc05d46d198467bebbb6e8fba50a325d4e7, ' 'CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/pytorch/.ci/pytorch/windows/tmp_bin/sccache-cl.exe, ' 'CXX_FLAGS=/DWIN32 /D_WINDOWS /EHsc ' '/Zc:__cplusplus /bigobj /FS /utf-8 ' '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO ' '-DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER ' '-DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM ' '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE ' '/wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 ' '/wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, ' 'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, ' 'TORCH_VERSION=2.8.0, USE_CUDA=0, USE_CUDNN=OFF, ' 'USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, ' 'USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, ' 'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, ' 'USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, ' 'USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, ' 'USE_XPU=OFF, \n', 'Python': '3.10.18 | packaged by conda-forge | (main, Jun 4 2025, 14:42:04) ' '[MSC v.1943 64 bit (AMD64)]', 'lmdeploy': "not installed:No module named 'lmdeploy'", 'numpy_random_seed': 2147483648, 'opencompass': '0.4.2+', 'sys.platform': 'win32', 'transformers': '4.48.0'}

Reproduces the problem - code/configuration sample

I created the following config file according to the tutorial:

from mmengine.config import read_base
from opencompass.models import OpenAISDK

# Configure the model
models = [
    dict(
        type=OpenAISDK,
        path='internlm3-latest',  # model name sent when requesting the service
        key='key',
        openai_api_base='https://internlm-chat.intern-ai.org.cn/puyu/api/v1/',  # API base URL
        rpm_verbose=True,
        query_per_second=0.16,
        max_out_len=1024,
        max_seq_len=4096,
        temperature=0.01,
        batch_size=1,
        retry=3,
    )
]

# Configure the dataset
datasets = [
    dict(
        path='newformat_sft_test_data.csv',
        data_type='mcq',
        infer_method='gen'
    )
]

Reproduces the problem - command or script

python run.py opencompass/configs/eval_tutorial_demo3.py --debug

Reproduces the problem - error message

  File "opencompass\opencompass\utils\datasets.py", line 48, in get_data_path
    local_path = DATASETS_MAPPING[dataset_id]['local']
KeyError: 'newformat_sft_test_data.csv'

Other information

In the tutorial-based config above, the dataset is declared as:

datasets = [
    dict(
        path='newformat_sft_test_data.csv',
        data_type='mcq',
        infer_method='gen'
    )
]

If path is an absolute path on Linux, such as "/root/opencompass/newformat_sft_test_data.csv", the evaluation runs.

Relative paths and Windows paths, however, cannot load the dataset. The cause is line 22 of opencompass/opencompass/utils/datasets.py:

    # For absolute path customized by the users
    if dataset_id.startswith('/'):
        return dataset_id

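Because of this check, only a dataset given by an absolute Linux path is returned directly. Every other form of path falls through to the DATASETS_MAPPING lookup and raises the KeyError shown above. Loading a local dataset with the method from the tutorial therefore still does not work in these cases, and I hope this can be improved.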
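One possible direction is sketched below. This is only an illustration, not OpenCompass code: `resolve_custom_path` is a hypothetical helper name, and it is an assumption that the caller would fall back to the existing DATASETS_MAPPING lookup when it returns None. It accepts absolute paths in both POSIX and Windows syntax, and relative paths that exist relative to the current working directory.

```python
import ntpath
import os
import posixpath
from typing import Optional


def resolve_custom_path(dataset_id: str) -> Optional[str]:
    """Hypothetical helper, not part of OpenCompass.

    Returns a path when `dataset_id` looks like a user-supplied path,
    or None so the caller can fall back to the registered-dataset lookup.
    """
    # Absolute path in POSIX or Windows syntax, checked independently
    # of the platform the code is running on.
    if posixpath.isabs(dataset_id) or ntpath.isabs(dataset_id):
        return dataset_id
    # Relative path that exists relative to the current working directory.
    if os.path.exists(dataset_id):
        return os.path.abspath(dataset_id)
    return None


# The existing check only recognizes POSIX-style absolute paths:
print('C:\\data\\newformat_sft_test_data.csv'.startswith('/'))  # False

# The sketch accepts the same Windows path:
print(resolve_custom_path('C:\\data\\newformat_sft_test_data.csv'))
```

Checking existence for relative paths means a typo in a file name would still end up in the mapping lookup and raise a KeyError, so a clearer error message for that case might also be worth considering.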

Lizlone, Aug 14 '25 18:08