Manjunath

2 comments by Manjunath

Hi, I am also experiencing the same issue while using the 7B model in a Jupyter notebook. Logs are attached below for reference. `> initializing model parallel with size 1 > initializing ddp...

Hi @justinxzhao, I tried a **learning rate of 0.0001**. The same issue persists. ``` Training: 18%|█▊ | 719/4000 [22:29