Error when training qwen2 with transformers == 4.44.2 and xtuner == 0.1.23
With transformers == 4.44.2 and xtuner == 0.1.23, training qwen2 with pack_to_max_length = False and use_varlen_attn = False raises an error. It seems transformers no longer has _flash_attention_forward.
[rank1]: File "/opt/conda/envs/xtuner-env/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 655, in forward [rank1]: hidden_states, self_attn_weights, present_key_value = self.self_attn( [rank1]: File "/opt/conda/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl [rank1]: return self._call_impl(*args, **kwargs) [rank1]: File "/opt/conda/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1603, in _call_impl [rank1]: result = forward_call(*args, **kwargs) [rank1]: File "/opt/conda/envs/xtuner-env/lib/python3.10/site-packages/xtuner/model/modules/dispatch/qwen2.py", line 160, in qwen2_attn_forward [rank1]: attn_output = self._flash_attention_forward( [rank1]: File "/opt/conda/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1729, in getattr [rank1]: raise AttributeError(f"'{type(self).name}' object has no attribute '{name}'") [rank1]: AttributeError: 'Qwen2FlashAttention2' object has no attribute '_flash_attention_forward'
Ran into the same problem. You can take a look at this: https://github.com/InternLM/xtuner/blob/main/requirements/runtime.txt
Sequence parallel needs transformers <4.43.
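For reference, here is a small guard, just a sketch, that fails fast before launching training if the installed transformers is too new for xtuner 0.1.23. The "<4.43" bound is taken from the comment above; the authoritative range is whatever requirements/runtime.txt actually pins.

# Version guard sketch: check requirements/runtime.txt for the exact supported
# range before relying on the "<4.43" bound used here.
import transformers
from packaging import version  # packaging ships as a transformers dependency

if version.parse(transformers.__version__) >= version.parse("4.43"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is too new for xtuner 0.1.23's "
        "qwen2 dispatch; downgrade, e.g.: pip install 'transformers<4.43'"
    )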