### Your current environment

```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC...
```
transformers == 4.44.2
xtuner == 0.1.23

When training qwen2 with `pack_to_max_length = False` and `use_varlen_attn = False`, an error is raised. It seems that transformers no longer has `_flash_attention_forward`.

```text
[rank1]:   File "/opt/conda/envs/xtuner-env/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 655, in forward
[rank1]:     hidden_states, self_attn_weights, present_key_value...
```
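To check the hypothesis that the helper was removed rather than something being misconfigured, here is a minimal diagnostic sketch (not part of the original report, assumed to run inside the same `xtuner-env`). In recent transformers releases the per-class `_flash_attention_forward` method was replaced by a module-level helper in `transformers.modeling_flash_attention_utils`, so the sketch reports which of the two the installed version exposes:

```python
# Diagnostic sketch: check which flash-attention API the installed transformers exposes.
import transformers
from transformers.models.qwen2 import modeling_qwen2

print("transformers version:", transformers.__version__)

# Older releases defined `_flash_attention_forward` as a method on the attention class.
print(
    "Qwen2FlashAttention2._flash_attention_forward present:",
    hasattr(modeling_qwen2.Qwen2FlashAttention2, "_flash_attention_forward"),
)

# Newer releases expose it as a module-level function instead.
try:
    from transformers.modeling_flash_attention_utils import _flash_attention_forward  # noqa: F401
    print("module-level _flash_attention_forward: available")
except ImportError:
    print("module-level _flash_attention_forward: not available")
```

If the hypothesis in the report is right, on 4.44.2 this should report the method as missing and the module-level helper as available, which would match the traceback above and point to a version mismatch between xtuner 0.1.23's attention dispatch and this transformers release.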