Qwen-7B TypeError: qwen_attention_forward() got an unexpected keyword argument 'registered_causal_mask'
The model is based on Qwen 1.0 and used to work, but with the latest ipex-llm (2.1.0b20240521, installed following https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen#1-install) it now fails with an unexpected keyword argument 'registered_causal_mask'. The same code works with Qwen-7B-Chat.
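For reference, generate_ipexllm.py follows the linked GPU example; below is a minimal sketch of the loading code, assuming the stock example's structure (the model path is a placeholder):

    import torch
    from ipex_llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    model_path = "path/to/qwen-1.0-finetune"  # placeholder for the Qwen-1.0-based model

    # ipex-llm converts the weights to sym_int4 on load (see the log below)
    model = AutoModelForCausalLM.from_pretrained(model_path,
                                                 load_in_4bit=True,
                                                 trust_remote_code=True,
                                                 use_cache=True)
    model = model.half().to('xpu')
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)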
python generate_ipexllm.py
C:\Users\Intel\miniconda3\envs\qwen\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: '' If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
2024-05-22 21:34:06,278 - INFO - intel_extension_for_pytorch auto imported
2024-05-22 21:34:06,330 - WARNING - Warning: please make sure that you are using the latest codes and checkpoints, especially if you used Qwen-7B before 09.25.2023. [The Chinese half of the warning repeats this: please use the latest model and code, especially if you started using Qwen-7B before September 25, and take care not to use the wrong code or model.]
2024-05-22 21:34:06,330 - WARNING - Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
2024-05-22 21:34:06,330 - WARNING - Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
2024-05-22 21:34:06,331 - WARNING - Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
2024-05-22 21:34:06,720 - INFO - Converting the current model to sym_int4 format......
Traceback (most recent call last):
File "C:\multi-modality\cvte_qwen\ultra_test_code_and_data\benchmark_test2intel\generate_ipexllm.py", line 71, in
Sorry, I cannot reproduce this issue with qwen-7b-chat:
(changmin-llm) arda@arda-arc13:~/changmin/llm.cpp$ pip install ipex-llm==2.1.0b20240521
Collecting ipex-llm==2.1.0b20240521
Using cached ipex_llm-2.1.0b20240521-py3-none-manylinux2010_x86_64.whl.metadata (5.0 kB)
Using cached ipex_llm-2.1.0b20240521-py3-none-manylinux2010_x86_64.whl (13.8 MB)
Installing collected packages: ipex-llm
Attempting uninstall: ipex-llm
Found existing installation: ipex-llm 2.1.0b20240522
Uninstalling ipex-llm-2.1.0b20240522:
Successfully uninstalled ipex-llm-2.1.0b20240522
Successfully installed ipex-llm-2.1.0b20240521
(changmin-llm) arda@arda-arc13:~/changmin/llm.cpp$ python qwen.py
/home/arda/miniforge3/envs/changmin-llm/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
2024-05-23 09:36:35,438 - INFO - intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|████████| 8/8 [00:00<00:00, 22.53it/s]
2024-05-23 09:36:35,965 - INFO - Converting the current model to sym_int4 format......
-------------------- Prompt --------------------
<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
AI是什么?
<|im_end|>
<|im_start|>assistant
-------------------- Output --------------------
system
You are a helpful assistant.
user
AI是什么?
assistant
AI是人工智能的缩写,它是指模拟人类智能的技术和方法。它是研究如何让计算机像人一样思考、学习、理解和处理信息的
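(The prompt "AI是什么?" asks "What is AI?"; the model replies with a standard definition of artificial intelligence, cut off by the token limit.) For completeness, qwen.py builds the ChatML prompt shown above roughly like this, reusing the model and tokenizer from the loading sketch earlier; a sketch of the stock example, not the exact script:

    QWEN_PROMPT_FORMAT = ("<|im_start|>system\n"
                          "You are a helpful assistant.\n<|im_end|>\n"
                          "<|im_start|>user\n{prompt}\n<|im_end|>\n"
                          "<|im_start|>assistant\n")

    prompt = QWEN_PROMPT_FORMAT.format(prompt="AI是什么?")
    with torch.inference_mode():
        input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
        output = model.generate(input_ids, max_new_tokens=32)
        print(tokenizer.decode(output[0], skip_special_tokens=True))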
This is fixed in https://github.com/intel-analytics/ipex-llm/pull/11110.
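Once a nightly build containing that PR is published, upgrading with pip install --pre --upgrade ipex-llm (the same flow as the install above) should pick up the fix.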