ColossalAI
[BUG]: Shardformer FP8 communication training accuracy degradation
Is there an existing issue for this bug?
- [X] I have searched the existing issues
🐛 Describe the bug
Setup: TP + Split Gather, 4 GPUs. Accuracy (Acc):
- Original FP16 model: 0.755
- FP8 communication: 0.737
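To quantify the gap between the two reported accuracies, here is a minimal sketch that uses only the numbers quoted in this report. The variable names are illustrative and are not ColossalAI identifiers.

```python
# Accuracy values as reported in this issue (TP + Split Gather, 4 GPUs).
fp16_acc = 0.755  # original FP16 model
fp8_acc = 0.737   # FP8 communication enabled

# Absolute drop in accuracy, and the drop relative to the FP16 baseline.
absolute_drop = fp16_acc - fp8_acc
relative_drop = absolute_drop / fp16_acc

print(f"absolute drop: {absolute_drop:.3f}")
print(f"relative drop: {relative_drop:.2%}")
```

That is a drop of about 0.018 in absolute terms, or roughly 2.4% relative to the FP16 baseline.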
Environment
No response