Shengyun-Si
> Hi, I've run into a problem: when I fine-tune Gemma2-2b using transformers.Trainer, the lr is always 0 and grad_norm is NaN. What's wrong? I'm using...