ms-swift icon indicating copy to clipboard operation
ms-swift copied to clipboard

NPU qwen2模型推理报错

Open JiayuQiao opened this issue 1 year ago • 2 comments

报错描述

使用swift infer命令,do_sample=True时报错,do_sample=False时可以推理但生成结果乱码

环境

  • NPU:昇腾910B3
  • Python:3.9.18
  • ms-swift:2.4.0.post1
  • torch-npu:2.1.0
  • Transformers:4.37.2

推理模型

Qwen2-7B-Instruct

报错内容

EZ9999: Inner Error! EZ9999 Kernel task happen error, retCode=0x2a, [aicpu exception].[FUNC:PreCheckTaskErr][FILE:task_info.cc][LINE:1677] TraceBack (most recent call last): AICPU Kernel task happen error, retCode=0x2a.[FUNC:GetError][FILE:stream.cc][LINE:1454] Aicpu kernel execute failed, device_id=0, stream_id=28, task_id=1726, errorCode=2a.[FUNC:PrintAicpuErrorInfo][FILE:task_info.cc][LINE:1522] Aicpu kernel execute failed, device_id=0, stream_id=28, task_id=1726, fault op_name=[FUNC:GetError][FILE:stream.cc][LINE:1454] rtStreamSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50] synchronize stream failed, runtime result = 507018[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]

Exception in thread Thread-6: Traceback (most recent call last): File "/mnt/dsep/python/venv/lib/python3.9/threading.py", line 980, in _bootstrap_inner self.run() File "/mnt/dsep/python/venv/lib/python3.9/threading.py", line 917, in run self._target(*self._args, **self._kwargs) File "/mnt/dsep/python/venv/lib/python3.9/site-packages/swift/llm/utils/utils.py", line 694, in _model_generate return model.generate(*args, **kwargs) File "/mnt/dsep/python/venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context return func(*args, **kwargs) File "/mnt/dsep/python/venv/lib/python3.9/site-packages/transformers/generation/utils.py", line 1525, in generate return self.sample( File "/mnt/dsep/python/venv/lib/python3.9/site-packages/transformers/generation/utils.py", line 2669, in sample streamer.put(next_tokens.cpu()) RuntimeError: ACL stream synchronize failed, error code:507018

JiayuQiao avatar Sep 05 '24 10:09 JiayuQiao

又测试了一下,Qwen2-Instruct系列只有0.5B模型能正常推理,其他模型都不可以,报错内容和7B模型相同。

JiayuQiao avatar Sep 06 '24 02:09 JiayuQiao

你好,请问你的swift infer和swift deploy可以用多卡吗?我没找到设置NPU多卡的参数

klaus-duan avatar Sep 23 '24 07:09 klaus-duan

又测试了一下,Qwen2-Instruct系列只有0.5B模型能正常推理,其他模型都不可以,报错内容和7B模型相同。

我测了0.5B和3B可以,7B会E39999

JackWu2671 avatar Nov 11 '24 02:11 JackWu2671

请使用transformers的推理代码试试可不可以。如果不行的话, ms-swift应该也是推理不了的

Jintao-Huang avatar Nov 11 '24 03:11 Jintao-Huang