opencompass [Bug] I have a question: why doesn't the civilcomments dataset support API evaluation?

Open cdp-study opened this issue 6 months ago • 0 comments

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] The bug has not been fixed in the latest version.

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': True, 'CUDA_HOME': '/usr/local/cuda-11.8', 'GCC': 'gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0', 'GPU 0': 'NVIDIA GeForce RTX 4090 D', 'MMEngine': '0.10.3', 'MUSA available': False, 'NVCC': 'Cuda compilation tools, release 11.8, V11.8.89', 'OpenCV': '4.9.0', 'PyTorch': '2.2.1', 'PyTorch compiling details': 'PyTorch built with:\n' ' - GCC 9.3\n' ' - C++ Version: 201703\n' ' - Intel(R) oneAPI Math Kernel Library Version ' '2023.1-Product Build 20230303 for Intel(R) 64 ' 'architecture applications\n' ' - Intel(R) MKL-DNN v3.3.2 (Git Hash ' '2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)\n' ' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n' ' - LAPACK is enabled (usually provided by ' 'MKL)\n' ' - NNPACK is enabled\n' ' - CPU capability usage: AVX2\n' ' - CUDA Runtime 11.8\n' ' - NVCC architecture flags: ' '-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37\n' ' - CuDNN 8.7\n' ' - Magma 2.6.1\n' ' - Build settings: BLAS_INFO=mkl, ' 'BUILD_TYPE=Release, CUDA_VERSION=11.8, ' 'CUDNN_VERSION=8.7.0, ' 'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, ' 'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 ' '-fabi-version=11 -fvisibility-inlines-hidden ' '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO ' '-DLIBKINETO_NOROCTRACER -DUSE_FBGEMM ' '-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK ' '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE ' '-O2 -fPIC -Wall -Wextra -Werror=return-type ' '-Werror=non-virtual-dtor -Werror=bool-operation ' '-Wnarrowing -Wno-missing-field-initializers ' '-Wno-type-limits -Wno-array-bounds ' '-Wno-unknown-pragmas -Wno-unused-parameter ' '-Wno-unused-function -Wno-unused-result ' '-Wno-strict-overflow -Wno-strict-aliasing ' '-Wno-stringop-overflow -Wsuggest-override 
' '-Wno-psabi -Wno-error=pedantic ' '-Wno-error=old-style-cast -Wno-missing-braces ' '-fdiagnostics-color=always -faligned-new ' '-Wno-unused-but-set-variable ' '-Wno-maybe-uninitialized -fno-math-errno ' '-fno-trapping-math -Werror=format ' '-Wno-stringop-overflow, LAPACK_INFO=mkl, ' 'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, ' 'PERF_WITH_AVX512=1, TORCH_VERSION=2.2.1, ' 'USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, ' 'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, ' 'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, ' 'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, ' 'USE_ROCM_KERNEL_ASSERT=OFF, \n', 'Python': '3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]', 'TorchVision': '0.17.1', 'numpy_random_seed': 2147483648, 'opencompass': '0.2.5+b272803', 'sys.platform': 'linux'}

Reproduces the problem - code/configuration sample

from mmengine.config import read_base

with read_base():
    from ..datasets.civilcomments.civilcomments_clp_a3c5fd import civilcomments_datasets
    from ..models.qwen.hf_qwen2_1_5b import models
    from ..summarizers.medium import summarizer

datasets = [*civilcomments_datasets]

work_dir = "outputs/models_qwen/qwen2-1_5_b"

Reproduces the problem - command or script

python run.py configs/models_eval/eval_qwen2_1_5b.py --debug

Reproduces the problem - error message

2024-08-12 11:24:24.659675: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-08-12 11:24:24.662386: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-12 11:24:24.690188: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-12 11:24:25.290460: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
08/12 11:24:26 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Modules of opencompass's partitioner registry have been automatically imported from opencompass.partitioners
08/12 11:24:26 - OpenCompass - DEBUG - Get class NumWorkerPartitioner from "partitioner" registry in "opencompass"
08/12 11:24:26 - OpenCompass - DEBUG - An NumWorkerPartitioner instance is built from registry, and its implementation can be found in opencompass.partitioners.num_worker
08/12 11:24:26 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
08/12 11:24:26 - OpenCompass - DEBUG - Additional config: {}
08/12 11:24:26 - OpenCompass - INFO - Partitioned into 1 tasks.
08/12 11:24:26 - OpenCompass - DEBUG - Task 0: [qwen2-1.5b-hf/._data_civil_comments_data]
08/12 11:24:26 - OpenCompass - DEBUG - Modules of opencompass's runner registry have been automatically imported from opencompass.runners
08/12 11:24:26 - OpenCompass - DEBUG - Get class LocalRunner from "runner" registry in "opencompass"
08/12 11:24:26 - OpenCompass - DEBUG - An LocalRunner instance is built from registry, and its implementation can be found in opencompass.runners.local
08/12 11:24:26 - OpenCompass - DEBUG - Modules of opencompass's task registry have been automatically imported from opencompass.tasks
08/12 11:24:26 - OpenCompass - DEBUG - Get class OpenICLInferTask from "task" registry in "opencompass"
08/12 11:24:26 - OpenCompass - DEBUG - An OpenICLInferTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_infer
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
08/12 11:24:28 - OpenCompass - INFO - Task [qwen2-1.5b-hf/._data_civil_comments_data]
2024-08-12 11:24:29.141128: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-08-12 11:24:29.143881: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-12 11:24:29.171594: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-12 11:24:29.655916: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
08/12 11:24:35 - OpenCompass - INFO - Start inferencing [qwen2-1.5b-hf/._data_civil_comments_data]
[2024-08-12 11:24:35,389] [opencompass.openicl.icl_inferencer.icl_clp_inferencer] [INFO] Calculating conditional log probability for prompts.
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_infer.py", line 162, in <module>
    inferencer.run()
  File "/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_infer.py", line 90, in run
    self._inference() #
  File "/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_infer.py", line 135, in _inference
    inferencer.inference(retriever,
  File "/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/openicl/icl_inferencer/icl_clp_inferencer.py", line 217, in inference
    self._get_cond_prob([prompt], [position],
  File "/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/openicl/icl_inferencer/icl_clp_inferencer.py", line 255, in _get_cond_prob
    get_logits = self.model.get_logits
AttributeError: 'HuggingFaceBaseModel' object has no attribute 'get_logits'
[2024-08-12 11:24:38,094] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 19294) of binary: /home/amov/anaconda3/envs/qwen/bin/python
Traceback (most recent call last):
  File "/home/amov/anaconda3/envs/qwen/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.2.1', 'console_scripts', 'torchrun')())
  File "/home/amov/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/home/amov/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/home/amov/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/home/amov/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/amov/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_infer.py FAILED
Failures:
  <NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
  time      : 2024-08-12_11:24:38
  host      : amov
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 19294)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

08/12 11:24:38 - OpenCompass - DEBUG - Get class NaivePartitioner from "partitioner" registry in "opencompass"
08/12 11:24:38 - OpenCompass - DEBUG - An NaivePartitioner instance is built from registry, and its implementation can be found in opencompass.partitioners.naive
08/12 11:24:38 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
08/12 11:24:38 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
08/12 11:24:38 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
08/12 11:24:38 - OpenCompass - DEBUG - Additional config: {'eval': {'runner': {'task': {}}}}
08/12 11:24:38 - OpenCompass - INFO - Partitioned into 1 tasks.
08/12 11:24:38 - OpenCompass - DEBUG - Task 0: [qwen2-1.5b-hf/._data_civil_comments_data]
08/12 11:24:38 - OpenCompass - DEBUG - Get class LocalRunner from "runner" registry in "opencompass"
08/12 11:24:38 - OpenCompass - DEBUG - An LocalRunner instance is built from registry, and its implementation can be found in opencompass.runners.local
08/12 11:24:38 - OpenCompass - DEBUG - Get class OpenICLEvalTask from "task" registry in "opencompass"
08/12 11:24:38 - OpenCompass - DEBUG - An OpenICLEvalTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
08/12 11:24:39 - OpenCompass - DEBUG - Modules of opencompass's load_dataset registry have been automatically imported from opencompass.datasets
08/12 11:24:39 - OpenCompass - DEBUG - Get class CivilCommentsDataset from "load_dataset" registry in "opencompass"
08/12 11:24:39 - OpenCompass - DEBUG - An CivilCommentsDataset instance is built from registry, and its implementation can be found in opencompass.datasets.civilcomments
08/12 11:24:39 - OpenCompass - ERROR - /home/amov/Documents/LLM/eval/platform/opencompass/opencompass/tasks/openicl_eval.py - _score - 243 - Task [qwen2-1.5b-hf/._data_civil_comments_data]: No predictions found.
08/12 11:24:39 - OpenCompass - DEBUG - An DefaultSummarizer instance is built from registry, and its implementation can be found in opencompass.summarizers.default
08/12 11:24:39 - OpenCompass - WARNING - unknown inferencer: opencompass.openicl.icl_inferencer.CLPInferencer - ._data_civil_comments_data

Other information

Why doesn't the civilcomments dataset support API evaluation?
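
Judging from the traceback, the failure seems to come from the CLP inferencer reading `self.model.get_logits`, which the `HuggingFaceBaseModel` wrapper does not define. A model that only returns generated text (as an API model presumably does) would hit the same problem. Below is a minimal sketch of a pre-flight check that would surface this before inference starts. The classes `LogitsModel` and `GenerateOnlyModel` and both helper functions are hypothetical stand-ins for illustration, not OpenCompass code.

```python
# Hypothetical stand-ins: these are NOT the real OpenCompass model wrappers.
class LogitsModel:
    """Stand-in for a wrapper that exposes raw logits."""

    def get_logits(self, inputs):
        # Dummy values: one row of two "logits" per input string.
        return [[0.0, 1.0] for _ in inputs]


class GenerateOnlyModel:
    """Stand-in for a wrapper that only returns generated text."""

    def generate(self, inputs):
        return ["" for _ in inputs]


def supports_clp(model) -> bool:
    """Return True if the model exposes a callable get_logits attribute."""
    return callable(getattr(model, "get_logits", None))


def check_clp_compatible(model) -> None:
    """Raise a descriptive error up front instead of a late AttributeError."""
    if not supports_clp(model):
        raise TypeError(
            f"{type(model).__name__} has no get_logits(); "
            "a conditional-log-probability inferencer cannot use it"
        )
```

With a check like this, the run would stop with a clear message rather than continuing to the evaluation stage and reporting "No predictions found".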

Aug 12 '24 04:08 cdp-study
