ColossalAI
[FEATURE]: support for gradient clipping by value
Describe the feature
Is there any plan to support PyTorch's clip_grad_value_ in ColossalAI?
Can you post more information about PyTorch's clip_grad_value_? How would you use it? A code snippet would be helpful.
Hi @haofanwang, the torch version should work well with ColossalAI. In addition, we provide our own clip_grad_norm in colossalai.utils.common to ensure correctness under model parallelism.
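For reference, a minimal sketch of the two stock PyTorch utilities discussed in this thread (this only illustrates the torch API on a toy model; it does not show the ColossalAI-specific clip_grad_norm):

```python
import torch

# Toy model and a backward pass to populate gradients.
model = torch.nn.Linear(4, 2)
loss = model(torch.randn(8, 4)).sum()
loss.backward()

# Clip-by-value: clamp every gradient element into [-0.1, 0.1] in place.
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=0.1)

# Clip-by-norm: rescale all gradients so their global L2 norm is at most 1.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```

Note the semantic difference: clip_grad_value_ clamps each element independently (changing the gradient direction), while clip_grad_norm_ rescales the whole gradient vector (preserving its direction), which is why model-parallel training needs a norm computed across all parameter shards.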