opencompass [Bug] RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] The bug has not been fixed in the latest version.

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': True,
 'CUDA_HOME': '/usr/local/cuda',
 'GCC': 'gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609',
 'GPU 0,1,2,3': 'NVIDIA RTX A6000',
 'GPU 4,5,6,7': 'NVIDIA GeForce RTX 2080 Ti',
 'MMEngine': '0.10.2',
 'NVCC': 'Cuda compilation tools, release 10.0, V10.0.13',
 'OpenCV': '4.8.1',
 'PyTorch': '1.13.1+cu117',
 'PyTorch compiling details': 'PyTorch built with:\n'
                              '  - GCC 9.3\n'
                              '  - C++ Version: 201402\n'
                              '  - Intel(R) Math Kernel Library Version '
                              '2020.0.0 Product Build 20191122 for Intel(R) 64 '
                              'architecture applications\n'
                              '  - Intel(R) MKL-DNN v2.6.0 (Git Hash '
                              '52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n'
                              '  - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
                              '  - LAPACK is enabled (usually provided by '
                              'MKL)\n'
                              '  - NNPACK is enabled\n'
                              '  - CPU capability usage: AVX2\n'
                              '  - CUDA Runtime 11.7\n'
                              '  - NVCC architecture flags: '
                              '-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n'
                              '  - CuDNN 8.5\n'
                              '  - Magma 2.6.1\n'
                              '  - Build settings: BLAS_INFO=mkl, '
                              'BUILD_TYPE=Release, CUDA_VERSION=11.7, '
                              'CUDNN_VERSION=8.5.0, '
                              'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, '
                              'CXX_FLAGS= -fabi-version=11 -Wno-deprecated '
                              '-fvisibility-inlines-hidden -DUSE_PTHREADPOOL '
                              '-fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM '
                              '-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK '
                              '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE '
                              '-DEDGE_PROFILER_USE_KINETO -O2 -fPIC '
                              '-Wno-narrowing -Wall -Wextra '
                              '-Werror=return-type -Werror=non-virtual-dtor '
                              '-Wno-missing-field-initializers '
                              '-Wno-type-limits -Wno-array-bounds '
                              '-Wno-unknown-pragmas -Wunused-local-typedefs '
                              '-Wno-unused-parameter -Wno-unused-function '
                              '-Wno-unused-result -Wno-strict-overflow '
                              '-Wno-strict-aliasing '
                              '-Wno-error=deprecated-declarations '
                              '-Wno-stringop-overflow -Wno-psabi '
                              '-Wno-error=pedantic -Wno-error=redundant-decls '
                              '-Wno-error=old-style-cast '
                              '-fdiagnostics-color=always -faligned-new '
                              '-Wno-unused-but-set-variable '
                              '-Wno-maybe-uninitialized -fno-math-errno '
                              '-fno-trapping-math -Werror=format '
                              '-Werror=cast-function-type '
                              '-Wno-stringop-overflow, LAPACK_INFO=mkl, '
                              'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
                              'PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, '
                              'USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, '
                              'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, '
                              'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, '
                              'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n',
 'Python': '3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]',
 'TorchVision': '0.14.1+cu117',
 'numpy_random_seed': 2147483648,
 'opencompass': '0.2.0+',
 'sys.platform': 'linux'}

Reproduces the problem - code/configuration sample

model_path="./opt125m/"
python run.py --datasets siqa_gen \
		--hf-path ${model_path} \
		--tokenizer-path ${model_path} \
		--model-kwargs trust_remote_code=True \
		--tokenizer-kwargs trust_remote_code=True \
		--max-out-len 100 \
		--max-seq-len 2048 \
		--batch-size 8 \
		--no-batch-padding \
		--num-gpus 1

Reproduces the problem - command or script

model_path="./opt125m/"
python run.py --datasets siqa_gen \
		--hf-path ${model_path} \
		--tokenizer-path ${model_path} \
		--model-kwargs trust_remote_code=True \
		--tokenizer-kwargs trust_remote_code=True \
		--max-out-len 100 \
		--max-seq-len 2048 \
		--batch-size 8 \
		--no-batch-padding \
		--num-gpus 1

Reproduces the problem - error message

12/30 17:30:51 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_hugginface_opt125m/siqa]
12/30 17:30:54 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_hugginface_opt125m/siqa]
[2023-12-30 17:30:55,647] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...

  0%|          | 0/245 [00:00<?, ?it/s]
  0%|          | 0/245 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data1/tangyh/opencompass/opencompass/tasks/openicl_infer.py", line 148, in <module>
    inferencer.run()
  File "/data1/tangyh/opencompass/opencompass/tasks/openicl_infer.py", line 78, in run
    self._inference()
  File "/data1/tangyh/opencompass/opencompass/tasks/openicl_infer.py", line 121, in _inference
    inferencer.inference(retriever,
  File "/data1/tangyh/opencompass/opencompass/openicl/icl_inferencer/icl_gen_inferencer.py", line 140, in inference
    results = self.model.generate_from_template(
  File "/data1/tangyh/opencompass/opencompass/models/base.py", line 141, in generate_from_template
    return self.generate(inputs, max_out_len=max_out_len, **kwargs)
  File "/data1/tangyh/opencompass/opencompass/models/huggingface.py", line 248, in generate
    return sum(
  File "/data1/tangyh/opencompass/opencompass/models/huggingface.py", line 249, in <genexpr>
    (self._single_generate(inputs=[input_],
  File "/data1/tangyh/opencompass/opencompass/models/huggingface.py", line 376, in _single_generate
    outputs = self.model.generate(input_ids=input_ids,
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/transformers/generation/utils.py", line 1718, in generate
    return self.greedy_search(
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/transformers/generation/utils.py", line 2579, in greedy_search
    outputs = self(
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 1143, in forward
    outputs = self.model.decoder(
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 909, in forward
    layer_outputs = decoder_layer(
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 547, in forward
    hidden_states = self.self_attn_layer_norm(hidden_states)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 25937) of binary: /data1/tangyh/.envs/opencompass/bin/python
Traceback (most recent call last):
  File "/data1/tangyh/.envs/opencompass/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/distributed/run.py", line 762, in main
    run(args)
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/distributed/run.py", line 753, in run
    elastic_launch(
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data1/tangyh/.envs/opencompass/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/data1/tangyh/opencompass/opencompass/tasks/openicl_infer.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-12-30_17:30:59
  host      : 2080ti-2
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 25937)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Other information

I have tried different torch versions (1.13 and 2.10) and different GPUs (2080ti and 3090ti), but the same error is raised;
torch.cuda.is_available consistently returns True;
I tried huggingface transformers and torch outside opencompass and they work well with opt125m, so I think the problem is caused by opencompass rather than my environment;
My machine cannot connect to huggingface, so I use a local copy of the huggingface model (opt125m) that I downloaded previously. It works fine outside of opencompass, as noted in the previous line.
I downloaded and installed opencompass today, so it is definitely the latest version.

I saw a similar bug reported in issue #630 and tried the advice given there, but that did NOT help.

I have been stuck on this issue for two days and any help will be appreciated.

Dec 30 '23 09:12 yuanhangtangle

I just solved this problem. device_map should be set to auto; otherwise the model stays on the CPU and inference starts there. However, the LayerNorm kernel for 'Half' precision is only implemented for GPU, not CPU, which is what the error message reports. Therefore, the following shell script works:

model_path="./opt125m/"
python run.py --datasets siqa_gen \
		--hf-path ${model_path} \
		--tokenizer-path ${model_path} \
		--model-kwargs device_map='auto' trust_remote_code=True \
		--tokenizer-kwargs trust_remote_code=True \
		--max-out-len 100 \
		--max-seq-len 2048 \
		--batch-size 8 \
		--no-batch-padding \
		--num-gpus 1
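
To check whether a run is affected, one option is to look at which device and dtype the loaded model's weights ended up with. The sketch below is a hypothetical helper, not part of opencompass: first_param_placement and hits_cpu_half_layernorm are illustrative names, and the stand-in object only mimics the parameters() interface of a PyTorch module so the snippet runs without loading a model.

```python
from types import SimpleNamespace


def first_param_placement(model):
    """Return (device_type, dtype_name) of the model's first parameter."""
    param = next(iter(model.parameters()))
    return param.device.type, str(param.dtype)


def hits_cpu_half_layernorm(device_type, dtype_name):
    """True for the combination seen in the traceback: float16 weights on the CPU."""
    return device_type == "cpu" and dtype_name.endswith("float16")


# Stand-in object (assumption, for illustration only) so the sketch runs
# without torch or a model download. With a real Hugging Face model, pass
# the loaded model to first_param_placement instead.
fake_param = SimpleNamespace(device=SimpleNamespace(type="cpu"),
                             dtype="torch.float16")
fake_model = SimpleNamespace(parameters=lambda: [fake_param])

placement = first_param_placement(fake_model)
print(placement, hits_cpu_half_layernorm(*placement))
```

If the check reports float16 weights on cpu, the model was never moved to a GPU, which matches the situation that device_map='auto' resolves above.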

Jan 06 '24 05:01 yuanhangtangle

Thanks for your solution, feel free to reopen it if needed

Apr 28 '24 16:04 bittersweet1999
