ColossalAI
[BUG]: Hybrid kernel went wrong when there were both fp16 and fp32 gradients
🐛 Describe the bug
When a model has both fp16 and fp32 gradients, HybridAdam may fail to update the parameters correctly.
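For reference, a minimal reproduction sketch. The parameter names here are hypothetical; it assumes `HybridAdam` is importable from `colossalai.nn.optimizer` (the module containing the file referenced below) and that a CUDA device is available:

```python
# Hypothetical reproduction: a parameter set mixing fp32 and fp16, so the
# optimizer sees gradients of both dtypes in a single parameter group.
import torch
from colossalai.nn.optimizer import HybridAdam  # assumed import path

params = torch.nn.ParameterDict({
    "w_fp32": torch.nn.Parameter(torch.randn(4, device="cuda")),
    "w_fp16": torch.nn.Parameter(
        torch.randn(4, device="cuda", dtype=torch.float16)
    ),
})
optimizer = HybridAdam(params.parameters(), lr=1e-3)

loss = params["w_fp32"].sum() + params["w_fp16"].float().sum()
loss.backward()

# The group now holds one fp32 gradient and one fp16 gradient; the fused
# step below may apply the wrong dtype to one of them.
optimizer.step()
```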
This happens because we put all parameters into a single flat list in colossalai/nn/optimizer/hybrid_adam.py, so fp16 and fp32 tensors end up in the same fused kernel call; a sketch of that grouping follows.
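Roughly, the grouping looks like this (a simplified sketch of the pattern, not the exact source; variable names are assumed):

```python
from typing import Dict, List
import torch

def collect_tensors(group: Dict, state: Dict) -> List[List[torch.Tensor]]:
    """Sketch of the grouping pattern (names assumed): every parameter
    with a gradient is appended to the same flat lists, regardless of
    whether its gradient is fp16 or fp32."""
    g_l, p_l, m_l, v_l = [], [], [], []
    for p in group["params"]:
        if p.grad is None:
            continue
        g_l.append(p.grad.data)
        p_l.append(p.data)
        m_l.append(state[p]["exp_avg"])
        v_l.append(state[p]["exp_avg_sq"])
    # All four lists are handed to ONE fused multi-tensor kernel call,
    # so fp16 and fp32 tensors can be interleaved in the same call.
    return [g_l, p_l, m_l, v_l]
```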
But the CUDA kernel only checks the dtype of the first element in the list and then processes every tensor in the call as if it had that dtype. When the list mixes fp16 and fp32 tensors, any tensor whose dtype differs from the first element's is read and written with the wrong dtype, corrupting its update.
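In Python terms, the dispatch is equivalent to the pattern below (an illustrative analogue of the first-element dtype check, not the actual CUDA source):

```python
import torch

def buggy_multi_tensor_scale(tensors, scale):
    """Python analogue of the buggy dispatch: only tensors[0] decides the
    scalar type; every other buffer is reinterpreted as that type."""
    chosen = tensors[0].dtype  # the only dtype check in the kernel
    for t in tensors:
        # Equivalent of casting the raw device pointer to `chosen*` in
        # CUDA: the tensor's bytes are reinterpreted, not converted.
        t.view(torch.uint8).view(chosen).mul_(scale)

a = torch.ones(4, dtype=torch.float16)  # first element fixes dtype = fp16
b = torch.ones(4, dtype=torch.float32)  # its bits get scaled as if fp16
buggy_multi_tensor_scale([a, b], 2.0)
print(a)  # [2., 2., 2., 2.] -- correct
print(b)  # garbage (e.g. 384.0): fp32 storage was updated as fp16 values
```

The fp16 tensor is updated correctly because it happens to match the first element's dtype, while the fp32 tensor's bits are silently mangled, which is exactly the failure mode described above.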
Environment
No response