opencompass
[Bug] I have a question: why doesn't the civilcomments dataset support API evaluation?
Prerequisite
- [X] I have searched Issues and Discussions but cannot get the expected help.
- [X] The bug has not been fixed in the latest version.
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
{'CUDA available': True, 'CUDA_HOME': '/usr/local/cuda-11.8', 'GCC': 'gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0', 'GPU 0': 'NVIDIA GeForce RTX 4090 D', 'MMEngine': '0.10.3', 'MUSA available': False, 'NVCC': 'Cuda compilation tools, release 11.8, V11.8.89', 'OpenCV': '4.9.0', 'PyTorch': '2.2.1', 'PyTorch compiling details': 'PyTorch built with:\n' ' - GCC 9.3\n' ' - C++ Version: 201703\n' ' - Intel(R) oneAPI Math Kernel Library Version ' '2023.1-Product Build 20230303 for Intel(R) 64 ' 'architecture applications\n' ' - Intel(R) MKL-DNN v3.3.2 (Git Hash ' '2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)\n' ' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n' ' - LAPACK is enabled (usually provided by ' 'MKL)\n' ' - NNPACK is enabled\n' ' - CPU capability usage: AVX2\n' ' - CUDA Runtime 11.8\n' ' - NVCC architecture flags: ' '-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37\n' ' - CuDNN 8.7\n' ' - Magma 2.6.1\n' ' - Build settings: BLAS_INFO=mkl, ' 'BUILD_TYPE=Release, CUDA_VERSION=11.8, ' 'CUDNN_VERSION=8.7.0, ' 'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, ' 'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 ' '-fabi-version=11 -fvisibility-inlines-hidden ' '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO ' '-DLIBKINETO_NOROCTRACER -DUSE_FBGEMM ' '-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK ' '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE ' '-O2 -fPIC -Wall -Wextra -Werror=return-type ' '-Werror=non-virtual-dtor -Werror=bool-operation ' '-Wnarrowing -Wno-missing-field-initializers ' '-Wno-type-limits -Wno-array-bounds ' '-Wno-unknown-pragmas -Wno-unused-parameter ' '-Wno-unused-function -Wno-unused-result ' '-Wno-strict-overflow -Wno-strict-aliasing ' '-Wno-stringop-overflow -Wsuggest-override 
' '-Wno-psabi -Wno-error=pedantic ' '-Wno-error=old-style-cast -Wno-missing-braces ' '-fdiagnostics-color=always -faligned-new ' '-Wno-unused-but-set-variable ' '-Wno-maybe-uninitialized -fno-math-errno ' '-fno-trapping-math -Werror=format ' '-Wno-stringop-overflow, LAPACK_INFO=mkl, ' 'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, ' 'PERF_WITH_AVX512=1, TORCH_VERSION=2.2.1, ' 'USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, ' 'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, ' 'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, ' 'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, ' 'USE_ROCM_KERNEL_ASSERT=OFF, \n', 'Python': '3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]', 'TorchVision': '0.17.1', 'numpy_random_seed': 2147483648, 'opencompass': '0.2.5+b272803', 'sys.platform': 'linux'}
Reproduces the problem - code/configuration sample
from mmengine.config import read_base

with read_base():
    from ..datasets.civilcomments.civilcomments_clp_a3c5fd import civilcomments_datasets
    from ..models.qwen.hf_qwen2_1_5b import models
    from ..summarizers.medium import summarizer

datasets = [*civilcomments_datasets]
work_dir = "outputs/models_qwen/qwen2-1_5_b"
Reproduces the problem - command or script
python run.py configs/models_eval/eval_qwen2_1_5b.py --debug
Reproduces the problem - error message
2024-08-12 11:24:24.659675: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-08-12 11:24:24.662386: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-12 11:24:24.690188: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-12 11:24:25.290460: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
08/12 11:24:26 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Modules of opencompass's partitioner registry have been automatically imported from opencompass.partitioners
08/12 11:24:26 - OpenCompass - DEBUG - Get class NumWorkerPartitioner from "partitioner" registry in "opencompass"
08/12 11:24:26 - OpenCompass - DEBUG - An NumWorkerPartitioner instance is built from registry, and its implementation can be found in opencompass.partitioners.num_worker
08/12 11:24:26 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Additional config: {}
08/12 11:24:26 - OpenCompass - INFO - Partitioned into 1 tasks.
08/12 11:24:26 - OpenCompass - DEBUG - Task 0: [qwen2-1.5b-hf/._data_civil_comments_data]
08/12 11:24:26 - OpenCompass - DEBUG - Modules of opencompass's runner registry have been automatically imported from opencompass.runners
08/12 11:24:26 - OpenCompass - DEBUG - Get class LocalRunner from "runner" registry in "opencompass"
08/12 11:24:26 - OpenCompass - DEBUG - An LocalRunner instance is built from registry, and its implementation can be found in opencompass.runners.local
08/12 11:24:26 - OpenCompass - DEBUG - Modules of opencompass's task registry have been automatically imported from opencompass.tasks
08/12 11:24:26 - OpenCompass - DEBUG - Get class OpenICLInferTask from "task" registry in "opencompass"
08/12 11:24:26 - OpenCompass - DEBUG - An OpenICLInferTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_infer
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
08/12 11:24:28 - OpenCompass - INFO - Task [qwen2-1.5b-hf/._data_civil_comments_data]
2024-08-12 11:24:29.141128: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-08-12 11:24:29.143881: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-12 11:24:29.171594: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-12 11:24:29.655916: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
08/12 11:24:35 - OpenCompass - INFO - Start inferencing [qwen2-1.5b-hf/._data_civil_comments_data]
[2024-08-12 11:24:35,389] [opencompass.openicl.icl_inferencer.icl_clp_inferencer] [INFO] Calculating conditional log probability for prompts.
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_infer.py", line 162, in
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
=======================================================================
/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_infer.py FAILED
Failures:
<NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
  time      : 2024-08-12_11:24:38
  host      : amov
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 19294)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
=========================================================================
08/12 11:24:38 - OpenCompass - DEBUG - Get class NaivePartitioner from "partitioner" registry in "opencompass"
08/12 11:24:38 - OpenCompass - DEBUG - An NaivePartitioner instance is built from registry, and its implementation can be found in opencompass.partitioners.naive
08/12 11:24:38 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
08/12 11:24:38 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
08/12 11:24:38 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
08/12 11:24:38 - OpenCompass - DEBUG - Additional config: {'eval': {'runner': {'task': {}}}}
08/12 11:24:38 - OpenCompass - INFO - Partitioned into 1 tasks.
08/12 11:24:38 - OpenCompass - DEBUG - Task 0: [qwen2-1.5b-hf/._data_civil_comments_data]
08/12 11:24:38 - OpenCompass - DEBUG - Get class LocalRunner from "runner" registry in "opencompass"
08/12 11:24:38 - OpenCompass - DEBUG - An LocalRunner instance is built from registry, and its implementation can be found in opencompass.runners.local
08/12 11:24:38 - OpenCompass - DEBUG - Get class OpenICLEvalTask from "task" registry in "opencompass"
08/12 11:24:38 - OpenCompass - DEBUG - An OpenICLEvalTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
08/12 11:24:39 - OpenCompass - DEBUG - Modules of opencompass's load_dataset registry have been automatically imported from opencompass.datasets
08/12 11:24:39 - OpenCompass - DEBUG - Get class CivilCommentsDataset from "load_dataset" registry in "opencompass"
08/12 11:24:39 - OpenCompass - DEBUG - An CivilCommentsDataset instance is built from registry, and its implementation can be found in opencompass.datasets.civilcomments
08/12 11:24:39 - OpenCompass - ERROR - /home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_eval.py - _score - 243 - Task [qwen2-1.5b-hf/._data_civil_comments_data]: No predictions found.
08/12 11:24:39 - OpenCompass - DEBUG - An DefaultSummarizer instance is built from registry, and its implementation can be found in opencompass.summarizers.default
08/12 11:24:39 - OpenCompass - WARNING - unknown inferencer: opencompass.openicl.icl_inferencer.CLPInferencer - ._data_civil_comments_data
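Note on the mkl-service error in the log above: the message itself suggests setting the threading layer explicitly. A commonly used workaround is to launch the run with MKL_THREADING_LAYER=GNU so that MKL and libgomp agree. The snippet below is only a minimal sketch under that assumption. The helper name env_with_gnu_threading is hypothetical, and it is not confirmed that this error is what makes the inference task fail, since the traceback is truncated.

```python
import os


def env_with_gnu_threading(base=None):
    """Return a copy of the environment with MKL's threading layer set to GNU.

    `base` defaults to the current process environment. The original mapping
    is not modified.
    """
    env = dict(os.environ if base is None else base)
    # Assumption: GNU is the layer compatible with libgomp on this machine.
    env["MKL_THREADING_LAYER"] = "GNU"
    return env


# Same command as in the report, expressed as an argument list.
cmd = ["python", "run.py", "configs/models_eval/eval_qwen2_1_5b.py", "--debug"]
env = env_with_gnu_threading()

# To actually launch the run with the adjusted environment:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

The shell equivalent would be to prefix the command with MKL_THREADING_LAYER=GNU. If the mkl-service line disappears from the log but the task still fails, the two problems are likely unrelated.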
Other information
Why doesn't the civilcomments dataset support API evaluation?
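For context on the question, the config above uses the clp variant of the dataset, and the log shows the CLP inferencer "Calculating conditional log probability for prompts". The sketch below illustrates the general idea of scoring answer choices from per-token logits. It is not OpenCompass's implementation: the function name, the toy vocabulary, and the choice token ids are all hypothetical. The point is that this style of scoring needs the model's logits, which a text-only API that returns generated strings may not expose, and that may be why API evaluation is not available for this config.

```python
import math


def choice_probabilities(logits, choice_ids):
    """Softmax over the whole vocabulary, then read off each choice token.

    `logits` holds one score per vocabulary token at the answer position.
    `choice_ids` are the token ids of the candidate answers.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [exps[i] / total for i in choice_ids]


# Toy example: a 4-token vocabulary where ids 1 and 2 stand in for the
# two candidate answers (hypothetical ids, for illustration only).
logits = [0.0, 2.0, 1.0, -1.0]
probs = choice_probabilities(logits, [1, 2])
log_probs = [math.log(p) for p in probs]
```

With generated text alone there is nothing to feed into choice_probabilities, so a generation-based config would be needed for a model reached through an API.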