
[Bug] Running the official Quick Start produces empty results

Open MansonHua opened this issue 9 months ago • 17 comments

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': True, 'CUDA_HOME': None, 'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0', 'GPU 0': 'NVIDIA GeForce RTX 3090', 'MMEngine': '0.10.4', 'MUSA available': False, 'OpenCV': '4.9.0', 'PyTorch': '2.3.0', 'PyTorch compiling details': 'PyTorch built with:\n' ' - GCC 9.3\n' ' - C++ Version: 201703\n' ' - Intel(R) oneAPI Math Kernel Library Version ' '2023.1-Product Build 20230303 for Intel(R) 64 ' 'architecture applications\n' ' - Intel(R) MKL-DNN v3.3.6 (Git Hash ' '86e6af5974177e513fd3fee58425e1063e7f1361)\n' ' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n' ' - LAPACK is enabled (usually provided by ' 'MKL)\n' ' - NNPACK is enabled\n' ' - CPU capability usage: AVX512\n' ' - CUDA Runtime 12.1\n' ' - NVCC architecture flags: ' '-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n' ' - CuDNN 8.9.2\n' ' - Magma 2.6.1\n' ' - Build settings: BLAS_INFO=mkl, ' 'BUILD_TYPE=Release, CUDA_VERSION=12.1, ' 'CUDNN_VERSION=8.9.2, ' 'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, ' 'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 ' '-fabi-version=11 -fvisibility-inlines-hidden ' '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO ' '-DLIBKINETO_NOROCTRACER -DUSE_FBGEMM ' '-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK ' '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE ' '-O2 -fPIC -Wall -Wextra -Werror=return-type ' '-Werror=non-virtual-dtor -Werror=bool-operation ' '-Wnarrowing -Wno-missing-field-initializers ' '-Wno-type-limits -Wno-array-bounds ' '-Wno-unknown-pragmas -Wno-unused-parameter ' '-Wno-unused-function -Wno-unused-result ' '-Wno-strict-overflow -Wno-strict-aliasing ' '-Wno-stringop-overflow -Wsuggest-override ' '-Wno-psabi -Wno-error=pedantic ' '-Wno-error=old-style-cast -Wno-missing-braces ' '-fdiagnostics-color=always -faligned-new ' 
'-Wno-unused-but-set-variable ' '-Wno-maybe-uninitialized -fno-math-errno ' '-fno-trapping-math -Werror=format ' '-Wno-stringop-overflow, LAPACK_INFO=mkl, ' 'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, ' 'PERF_WITH_AVX512=1, TORCH_VERSION=2.3.0, ' 'USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, ' 'USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, ' 'USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, ' 'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, ' 'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, ' 'USE_ROCM_KERNEL_ASSERT=OFF, \n', 'Python': '3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]', 'TorchVision': '0.18.0', 'numpy_random_seed': 2147483648, 'opencompass': '0.2.4+e3c0448', 'sys.platform': 'linux'}

Reproduces the problem - code/configuration sample

Run the example from the Quick Start:

python run.py --models hf_opt_125m hf_opt_350m --datasets siqa_gen winograd_ppl

Reproduces the problem - command or script

See the previous item.

Reproduces the problem - error message

Output at run time:

(opencompass) aidt@aidt-System-Product-Name:~/PycharmProjects/opencompass$ python run.py --models hf_opt_125m hf_opt_350m --datasets siqa_gen winograd_ppl
05/16 15:14:31 - OpenCompass - INFO - Loading siqa_gen: configs/datasets/siqa/siqa_gen.py
05/16 15:14:31 - OpenCompass - INFO - Loading winograd_ppl: configs/datasets/winograd/winograd_ppl.py
05/16 15:14:31 - OpenCompass - INFO - Loading hf_opt_125m: configs/models/opt/hf_opt_125m.py
05/16 15:14:31 - OpenCompass - INFO - Loading hf_opt_350m: configs/models/opt/hf_opt_350m.py
05/16 15:14:31 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
05/16 15:14:31 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
05/16 15:14:31 - OpenCompass - INFO - Partitioned into 2 tasks.
launch OpenICLInfer[opt-125m-hf/siqa,opt-125m-hf/winograd] on GPU 0
0%| | 0/2 [00:00<?, ?it/s]
05/16 15:14:32 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/runners/local.py - _launch - 208 - task OpenICLInfer[opt-125m-hf/siqa,opt-125m-hf/winograd] fail, see outputs/default/20240516_151431/logs/infer/opt-125m-hf/siqa.out
launch OpenICLInfer[opt-350m-hf/siqa,opt-350m-hf/winograd] on GPU 0
50%|██████████████████████▌ | 1/2 [00:01<00:01, 1.24s/it]
05/16 15:14:33 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/runners/local.py - _launch - 208 - task OpenICLInfer[opt-350m-hf/siqa,opt-350m-hf/winograd] fail, see outputs/default/20240516_151431/logs/infer/opt-350m-hf/siqa.out
100%|█████████████████████████████████████████████| 2/2 [00:02<00:00, 1.20s/it]
05/16 15:14:33 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/runners/base.py - summarize - 64 - OpenICLInfer[opt-125m-hf/siqa,opt-125m-hf/winograd] failed with code 1
05/16 15:14:33 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/runners/base.py - summarize - 64 - OpenICLInfer[opt-350m-hf/siqa,opt-350m-hf/winograd] failed with code 1
05/16 15:14:33 - OpenCompass - INFO - Partitioned into 4 tasks.
launch OpenICLEval[opt-125m-hf/siqa] on CPU
launch OpenICLEval[opt-125m-hf/winograd] on CPU
launch OpenICLEval[opt-350m-hf/siqa] on CPU
launch OpenICLEval[opt-350m-hf/winograd] on CPU
100%|█████████████████████████████████████████████| 4/4 [00:25<00:00, 6.36s/it]
dataset    version    metric    mode    opt-125m-hf    opt-350m-hf
siqa       -          -         -       -              -
winograd   -          -         -       -              -
05/16 15:14:59 - OpenCompass - INFO - write summary to /home/aidt/PycharmProjects/opencompass/outputs/default/20240516_151431/summary/summary_20240516_151431.txt
05/16 15:14:59 - OpenCompass - INFO - write csv to /home/aidt/PycharmProjects/opencompass/outputs/default/20240516_151431/summary/summary_20240516_151431.csv

Checking siqa.out, its content is:

Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
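Since each failed task only points at its own .out file, it can help to scan every inference log of a run for the MKL message in one pass. The following is a minimal sketch, not part of OpenCompass: `find_failed_infer_logs` is a hypothetical helper, and the `logs/infer` layout is assumed from the paths printed in the log above.

```python
from pathlib import Path

# Substring copied from the error text quoted in this report.
MKL_ERROR = "MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1"


def find_failed_infer_logs(run_dir, needle=MKL_ERROR):
    """Return the *.out files under <run_dir>/logs/infer that contain `needle`.

    Hypothetical helper; the directory layout is an assumption taken from
    paths such as outputs/default/<timestamp>/logs/infer/opt-125m-hf/siqa.out.
    """
    infer_dir = Path(run_dir) / "logs" / "infer"
    hits = []
    for log_path in sorted(infer_dir.rglob("*.out")):
        text = log_path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            hits.append(log_path)
    return hits


if __name__ == "__main__":
    # Run directory copied from the log above; replace with your own timestamp.
    for path in find_failed_infer_logs("outputs/default/20240516_151431"):
        print(path)
```

Passing a different `needle` (for example "Traceback") turns the same loop into a general search for failing inference logs.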

I tried setting the environment variable:

export MKL_SERVICE_FORCE_INTEL=1

and re-running:

python run.py --models hf_opt_125m hf_opt_350m --datasets siqa_gen winograd_ppl

but it hangs indefinitely at:

(opencompass) aidt@aidt-System-Product-Name:~/PycharmProjects/opencompass$ python run.py --models hf_opt_125m hf_opt_350m --datasets siqa_gen winograd_ppl
05/16 15:22:14 - OpenCompass - INFO - Loading siqa_gen: configs/datasets/siqa/siqa_gen.py
05/16 15:22:14 - OpenCompass - INFO - Loading winograd_ppl: configs/datasets/winograd/winograd_ppl.py
05/16 15:22:14 - OpenCompass - INFO - Loading hf_opt_125m: configs/models/opt/hf_opt_125m.py
05/16 15:22:14 - OpenCompass - INFO - Loading hf_opt_350m: configs/models/opt/hf_opt_350m.py
05/16 15:22:14 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
05/16 15:22:14 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
05/16 15:22:14 - OpenCompass - INFO - Partitioned into 2 tasks.
launch OpenICLInfer[opt-125m-hf/siqa,opt-125m-hf/winograd] on GPU 0
0%| | 0/2 [00:00<?, ?it/s]

Other information

No response

MansonHua avatar May 16 '24 07:05 MansonHua

Please add --debug for more information

tonysy avatar May 17 '24 02:05 tonysy

(opencompass) aidt@aidt-System-Product-Name:~/PycharmProjects/opencompass$ python run.py --models hf_opt_125m hf_opt_350m --datasets siqa_gen winograd_ppl --debug
05/17 11:02:26 - OpenCompass - INFO - Loading siqa_gen: configs/datasets/siqa/siqa_gen.py
05/17 11:02:26 - OpenCompass - INFO - Loading winograd_ppl: configs/datasets/winograd/winograd_ppl.py
05/17 11:02:26 - OpenCompass - INFO - Loading hf_opt_125m: configs/models/opt/hf_opt_125m.py
05/17 11:02:26 - OpenCompass - INFO - Loading hf_opt_350m: configs/models/opt/hf_opt_350m.py
05/17 11:02:26 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
05/17 11:02:26 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
05/17 11:02:26 - OpenCompass - DEBUG - Modules of opencompass's partitioner registry have been automatically imported from opencompass.partitioners
05/17 11:02:26 - OpenCompass - DEBUG - Get class NumWorkerPartitioner from "partitioner" registry in "opencompass"
05/17 11:02:26 - OpenCompass - DEBUG - An NumWorkerPartitioner instance is built from registry, and its implementation can be found in opencompass.partitioners.num_worker
05/17 11:02:26 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
05/17 11:02:26 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
05/17 11:02:26 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
05/17 11:02:26 - OpenCompass - DEBUG - Additional config: {}
05/17 11:02:26 - OpenCompass - INFO - Partitioned into 2 tasks.
05/17 11:02:26 - OpenCompass - DEBUG - Task 0: [opt-125m-hf/siqa,opt-125m-hf/winograd]
05/17 11:02:26 - OpenCompass - DEBUG - Task 1: [opt-350m-hf/siqa,opt-350m-hf/winograd]
05/17 11:02:26 - OpenCompass - DEBUG - Modules of opencompass's runner registry have been automatically imported from opencompass.runners
05/17 11:02:26 - OpenCompass - DEBUG - Get class LocalRunner from "runner" registry in "opencompass"
05/17 11:02:26 - OpenCompass - DEBUG - An LocalRunner instance is built from registry, and its implementation can be found in opencompass.runners.local
05/17 11:02:26 - OpenCompass - DEBUG - Modules of opencompass's task registry have been automatically imported from opencompass.tasks
05/17 11:02:26 - OpenCompass - DEBUG - Get class OpenICLInferTask from "task" registry in "opencompass"
05/17 11:02:26 - OpenCompass - DEBUG - An OpenICLInferTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_infer
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
05/17 11:02:27 - OpenCompass - DEBUG - Get class OpenICLInferTask from "task" registry in "opencompass"
05/17 11:02:27 - OpenCompass - DEBUG - An OpenICLInferTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_infer
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
05/17 11:02:28 - OpenCompass - DEBUG - Get class NaivePartitioner from "partitioner" registry in "opencompass"
05/17 11:02:28 - OpenCompass - DEBUG - An NaivePartitioner instance is built from registry, and its implementation can be found in opencompass.partitioners.naive
05/17 11:02:28 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
05/17 11:02:28 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
05/17 11:02:28 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
05/17 11:02:28 - OpenCompass - DEBUG - Additional config: {'eval': {'runner': {'task': {}}}}
05/17 11:02:28 - OpenCompass - INFO - Partitioned into 4 tasks.
05/17 11:02:28 - OpenCompass - DEBUG - Task 0: [opt-125m-hf/siqa]
05/17 11:02:28 - OpenCompass - DEBUG - Task 1: [opt-125m-hf/winograd]
05/17 11:02:28 - OpenCompass - DEBUG - Task 2: [opt-350m-hf/siqa]
05/17 11:02:28 - OpenCompass - DEBUG - Task 3: [opt-350m-hf/winograd]
05/17 11:02:28 - OpenCompass - DEBUG - Get class LocalRunner from "runner" registry in "opencompass"
05/17 11:02:28 - OpenCompass - DEBUG - An LocalRunner instance is built from registry, and its implementation can be found in opencompass.runners.local
05/17 11:02:28 - OpenCompass - DEBUG - Get class OpenICLEvalTask from "task" registry in "opencompass"
05/17 11:02:28 - OpenCompass - DEBUG - An OpenICLEvalTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/17 11:02:29 - OpenCompass - DEBUG - Modules of opencompass's load_dataset registry have been automatically imported from opencompass.datasets
05/17 11:02:29 - OpenCompass - DEBUG - Get class siqaDataset_V2 from "load_dataset" registry in "opencompass"
05/17 11:02:30 - OpenCompass - DEBUG - An siqaDataset_V2 instance is built from registry, and its implementation can be found in opencompass.datasets.siqa
05/17 11:02:30 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/tasks/openicl_eval.py - _score - 241 - Task [opt-125m-hf/siqa]: No predictions found.
05/17 11:02:30 - OpenCompass - DEBUG - Get class OpenICLEvalTask from "task" registry in "opencompass"
05/17 11:02:30 - OpenCompass - DEBUG - An OpenICLEvalTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/17 11:02:30 - OpenCompass - DEBUG - Get class winogradDataset from "load_dataset" registry in "opencompass"
05/17 11:02:39 - OpenCompass - DEBUG - An winogradDataset instance is built from registry, and its implementation can be found in opencompass.datasets.winograd
05/17 11:02:39 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/tasks/openicl_eval.py - _score - 241 - Task [opt-125m-hf/winograd]: No predictions found.
05/17 11:02:39 - OpenCompass - DEBUG - Get class OpenICLEvalTask from "task" registry in "opencompass"
05/17 11:02:39 - OpenCompass - DEBUG - An OpenICLEvalTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/17 11:02:40 - OpenCompass - DEBUG - Get class siqaDataset_V2 from "load_dataset" registry in "opencompass"
05/17 11:02:40 - OpenCompass - DEBUG - An siqaDataset_V2 instance is built from registry, and its implementation can be found in opencompass.datasets.siqa
05/17 11:02:40 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/tasks/openicl_eval.py - _score - 241 - Task [opt-350m-hf/siqa]: No predictions found.
05/17 11:02:40 - OpenCompass - DEBUG - Get class OpenICLEvalTask from "task" registry in "opencompass"
05/17 11:02:40 - OpenCompass - DEBUG - An OpenICLEvalTask instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/17 11:02:41 - OpenCompass - DEBUG - Get class winogradDataset from "load_dataset" registry in "opencompass"
05/17 11:02:48 - OpenCompass - DEBUG - An winogradDataset instance is built from registry, and its implementation can be found in opencompass.datasets.winograd
05/17 11:02:48 - OpenCompass - ERROR - /home/aidt/PycharmProjects/opencompass/opencompass/tasks/openicl_eval.py - _score - 241 - Task [opt-350m-hf/winograd]: No predictions found.
05/17 11:02:48 - OpenCompass - DEBUG - An DefaultSummarizer instance is built from registry, and its implementation can be found in opencompass.summarizers.default
dataset    version    metric    mode    opt-125m-hf    opt-350m-hf
siqa       -          -         -       -              -
winograd   -          -         -       -              -
05/17 11:02:48 - OpenCompass - INFO - write summary to /home/aidt/PycharmProjects/opencompass/outputs/default/20240517_110226/summary/summary_20240517_110226.txt
05/17 11:02:48 - OpenCompass - INFO - write csv to /home/aidt/PycharmProjects/opencompass/outputs/default/20240517_110226/summary/summary_20240517_110226.csv

The output after adding --debug is shown above.

MansonHua avatar May 17 '24 03:05 MansonHua

I ran into the same problem. The ERROR lines under --debug all look like this: 06/01 11:30:28 - OpenCompass - ERROR - /root/data1/opencompass/opencompass/tasks/openicl_eval.py - _score - 243 - Task [chatglm2-6b_hf/medbench-Med-Exam]: No predictions found.

kamieee avatar Jun 01 '24 03:06 kamieee

I ran into this problem too: 06/02 11:59:30 - OpenCompass - ERROR - d:\graduate\deeplearning\1_new_work\LLM\code1\opencompass-main\opencompass\tasks\openicl_eval.py - _score - 244 - Task [opt-125m-hf/winograd]: No predictions found.

XiaozhuLove avatar Jun 02 '24 04:06 XiaozhuLove

I ran into the same problem. The ERROR lines under --debug all look like this: 06/01 11:30:28 - OpenCompass - ERROR - /root/data1/opencompass/opencompass/tasks/openicl_eval.py - _score - 243 - Task [chatglm2-6b_hf/medbench-Med-Exam]: No predictions found.

Have you solved it?

XiaozhuLove avatar Jun 02 '24 06:06 XiaozhuLove

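I ran into the same problem. The ERROR lines under --debug all look like this: 06/01 11:30:28 - OpenCompass - ERROR - /root/data1/opencompass/opencompass/tasks/openicl_eval.py - _score - 243 - Task [chatglm2-6b_hf/medbench-Med-Exam]: No predictions found.

Have you solved it?

I ended up using the official evaluate code of ChatGLM2-6B (the model I want to evaluate) and verifying each dataset by hand.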

kamieee avatar Jun 02 '24 08:06 kamieee

I ran into this problem too. I'm stuck right here.

liangjunbo1994 avatar Jun 03 '24 03:06 liangjunbo1994

I hit this problem too: without MKL_SERVICE_FORCE_INTEL=1 the output is empty; with MKL_SERVICE_FORCE_INTEL=1 it hangs indefinitely at [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process... | 0/799 [00:00<?, ?it/s]

WeeeicheN avatar Jun 03 '24 10:06 WeeeicheN

I ran into the same problem. The ERROR lines under --debug all look like this: 06/01 11:30:28 - OpenCompass - ERROR - /root/data1/opencompass/opencompass/tasks/openicl_eval.py - _score - 243 - Task [chatglm2-6b_hf/medbench-Med-Exam]: No predictions found.

Have you solved it?

I have the same problem. How can it be solved?

nanwang-crea avatar Jun 04 '24 07:06 nanwang-crea

Check whether the logs under the infer folder contain an error. If the error is MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library, then set the following at the top of run.py:

import os
os.environ["MKL_SERVICE_FORCE_INTEL"] = '1'
os.environ["MKL_THREADING_LAYER"] = '1'
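A minimal sketch of that suggestion, written so the variables are set before numpy or torch are imported (the error text itself says "Try to import numpy first or set the threading layer accordingly"). `configure_mkl_env` is a hypothetical helper, not an OpenCompass function. Child processes started afterwards inherit `os.environ`, which is why placing this at the very top of the launching script can also reach tasks that are spawned later.

```python
import os


def configure_mkl_env(threading_layer=None, force_intel=True):
    """Set MKL-related environment variables for this process and its children.

    Hypothetical helper. It only has an effect on libraries imported (or
    subprocesses started) after it runs, so call it before numpy/torch.
    """
    if force_intel:
        os.environ["MKL_SERVICE_FORCE_INTEL"] = "1"
    if threading_layer is not None:
        os.environ["MKL_THREADING_LAYER"] = threading_layer
    keys = ("MKL_SERVICE_FORCE_INTEL", "MKL_THREADING_LAYER")
    return {key: os.environ.get(key) for key in keys}


# Usage: run this before any `import numpy` / `import torch` line.
settings = configure_mkl_env(threading_layer="GNU")
print(settings)
```

The value "GNU" in the usage line is one of the named layers Intel documents for MKL_THREADING_LAYER, and it follows the error's hint to set the threading layer to match libgomp. Whether it or the '1' used above is the right choice for a given environment is not something this thread settles, so treat the value as an assumption to verify.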

If, after setting this, it hangs at [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process... | 0/799 [00:00<?, ?it/s], check whether torch can actually use the GPU. If it cannot, reinstall torch.
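One way to check whether torch can use the GPU is sketched below. `cuda_report` is a hypothetical helper; the torch calls it uses (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name()`) are standard PyTorch APIs. It also runs one tiny operation on the device, on the assumption that a build can report the GPU as available and still fail or stall once real work is submitted.

```python
def cuda_report():
    """Collect basic facts about the installed torch build and GPU visibility."""
    try:
        import torch
    except Exception as exc:  # missing package or a broken native library
        return {"torch_installed": False, "error": repr(exc)}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # is_available() does not execute anything on the GPU, so run a tiny op.
        x = torch.ones(8, device="cuda")
        report["sum_on_gpu"] = float(x.sum().item())
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` prints None, the installed wheel is a CPU-only build, which is one situation where reinstalling torch would be expected to help.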

qianxianyang avatar Jun 05 '24 03:06 qianxianyang

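Check whether the logs under the infer folder contain an error. If the error is MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library, then set the following at the top of run.py:

import os
os.environ["MKL_SERVICE_FORCE_INTEL"] = '1'
os.environ["MKL_THREADING_LAYER"] = '1'

If, after setting this, it hangs at [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process... | 0/799 [00:00<?, ?it/s], check whether torch can actually use the GPU. If it cannot, reinstall torch.

That is the right answer. After I added --debug to the run arguments I found this problem as well. My guess is that in an environment set up the usual way from the README (cd opencompass; pip install -e), torch is broken and needs to be reinstalled.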

WeeeicheN avatar Jun 12 '24 07:06 WeeeicheN

nice

import os
os.environ["MKL_SERVICE_FORCE_INTEL"] = '1'
os.environ["MKL_THREADING_LAYER"] = '1'

With this plus a reinstall, the quick start ran successfully.

VastOcean-Yang avatar Jun 14 '24 09:06 VastOcean-Yang

reinstall torch

ljqiff avatar Aug 19 '24 15:08 ljqiff

Thank you very much, I ran into the same problem.

AnnyShen55 avatar Aug 19 '24 22:08 AnnyShen55

I have received your email.

ljqiff avatar Aug 19 '24 22:08 ljqiff

python run.py --datasets siqa_gen winograd_ppl --hf-type base --hf-path facebook/opt-125m
08/19 16:01:51 - OpenCompass - INFO - Loading siqa_gen: configs/datasets/siqa/siqa_gen.py
08/19 16:01:51 - OpenCompass - INFO - Loading winograd_ppl: configs/datasets/winograd/winograd_ppl.py
08/19 16:01:51 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
08/19 16:01:51 - OpenCompass - INFO - Current exp folder: outputs/default/20240819_160151
08/19 16:01:51 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
Downloading builder script: 100%|████████████████████████████████████████████████████████████████████████████| 5.42k/5.42k [00:00<00:00, 34.0kB/s]
Downloading readme: 100%|████████████████████████████████████████████████████████████████████████████████████| 8.61k/8.61k [00:00<00:00, 32.3kB/s]
Downloading data: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 113k/113k [00:00<00:00, 603kB/s]
Generating test split: 100%|███████████████████████████████████████████████████████████████████████████| 285/285 [00:00<00:00, 2729.86 examples/s]
Map: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 285/285 [00:00<00:00, 3561.77 examples/s]
08/19 16:01:57 - OpenCompass - INFO - Partitioned into 1 tasks.
launch OpenICLInfer[opt-125m_hf/siqa,opt-125m_hf/winograd] on GPU 0
0%| | 0/1 [00:00<?, ?it/s]

I reinstalled torch, but it still hangs here. Is there anything else I can try?

AnnyShen55 avatar Aug 19 '24 23:08 AnnyShen55

Check whether the logs under the infer folder contain an error. If the error is MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library, then set the following at the top of run.py:

import os
os.environ["MKL_SERVICE_FORCE_INTEL"] = '1'
os.environ["MKL_THREADING_LAYER"] = '1'

If, after setting this, it hangs at [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process... | 0/799 [00:00<?, ?it/s], check whether torch can actually use the GPU. If it cannot, reinstall torch.

@tonysy @XiaozhuLove @ljqiff @kennymckormick @WeeeicheN I have set the two environment variables in run.py, reinstalled PyTorch (2.3), and checked torch.cuda.is_available(), but it still doesn't work. It gets stuck here: 0%| | 0/1 [00:00<?, ?it/s]

ppalantir avatar Aug 21 '24 12:08 ppalantir
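For a hang like this that prints nothing further, one generic way to see where Python is stuck is the standard library's faulthandler module, which can dump every thread's stack after a timeout. This is a sketch of that general technique, not a fix confirmed in this thread. `arm_hang_dump` and `disarm_hang_dump` are hypothetical helper names, and where to call them inside OpenCompass (it has to be the process that actually hangs) is left as an assumption.

```python
import faulthandler
import sys


def arm_hang_dump(timeout_s=120.0, file=sys.stderr, repeat=False):
    """After `timeout_s` seconds, write every thread's Python stack to `file`.

    `file` must be a real file with a file descriptor (stderr or an open file).
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=repeat, file=file)


def disarm_hang_dump():
    """Cancel a pending dump, e.g. once the slow section has finished."""
    faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    # If the code after this line is still running after 120 s, the stacks of
    # all threads are written to stderr without stopping the program.
    arm_hang_dump(120.0)
    # ... the code that hangs would run here ...
    disarm_hang_dump()
```

The dumped stack shows which call the process is waiting in (for example a model load, a download, or a CUDA call), which narrows down whether the hang is related to torch at all.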