ColossalAI
[BUG]: Pytest with a specific config failed after PR #5868
Is there an existing issue for this bug?
- [X] I have searched the existing issues
🐛 Describe the bug
Main repo `test_shard_llama` fails for these configs:

```python
{'tp_size': 2,
 'pp_size': 2,
 'sp_size': 2,
 'num_microbatches': 2,
 'enable_sequence_parallelism': True,
 'sequence_parallelism_mode': 'ring',
 'enable_flash_attention': True,
 'zero_stage': 1,
 'precision': 'fp16',
 'initial_scale': 1}

{'tp_size': 2,
 'sp_size': 2,
 'pp_size': 2,
 'num_microbatches': 2,
 'enable_sequence_parallelism': True,
 'sequence_parallelism_mode': 'split_gather',
 'enable_flash_attention': False,
 'precision': 'fp16',
 'initial_scale': 1}
```
The failure message is:

```
E   File "/home/nvme-share/home/zhangguangyao/ColossalAI/colossalai/shardformer/modeling/llama.py", line 530, in forward
E     query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin)
E   File "/home/nvme-share/home/zhangguangyao/hf_transformers/src/transformers/models/llama/modeling_llama.py", line 206, in apply_rotary_pos_emb
E     q_embed = (q * cos) + (rotate_half(q) * sin)
E   RuntimeError: The size of tensor a (16) must match the size of tensor b (8) at non-singleton dimension 2
```
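For context on what this error means: `q * cos` follows NumPy/PyTorch broadcasting rules, which align shapes from the trailing dimension and require each pair of sizes to be equal or 1. The sketch below reproduces the failure with a pure-Python model of that rule. The concrete shapes are assumptions chosen only to match the message (16 vs 8 at dimension 2); a plausible but unconfirmed cause is that `cos`/`sin` were built for a sequence-parallel shard (16 / sp_size=2 = 8) while `q` still carries the full local sequence length.

```python
def broadcast_shapes(a, b):
    """Return the broadcast shape of a and b, or raise a PyTorch-style error."""
    out = []
    # Align trailing dimensions, padding the shorter shape with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            dim = max(len(a), len(b)) - i
            raise RuntimeError(
                f"The size of tensor a ({da}) must match the size of "
                f"tensor b ({db}) at non-singleton dimension {dim}"
            )
        out.append(max(da, db))
    return tuple(reversed(out))

# Hypothetical shapes (batch, heads, seq_len, head_dim) for illustration:
q_shape = (1, 16, 16, 128)   # q keeps the full local sequence length (16)
cos_shape = (1, 1, 8, 128)   # rotary cache built for a seq-parallel shard (8)

try:
    broadcast_shapes(q_shape, cos_shape)
except RuntimeError as e:
    print(e)  # sizes 16 vs 8 clash at dimension 2, as in the traceback
```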
I have found that this failure was introduced after PR #5868 was merged. Please take a look.
Environment
No response