Osama Amjad
Results: 2 comments of Osama Amjad
These are my loss and learning-rate curves for training the model on a single GPU with a batch size of 2 on the Waymo dataset. After the 2nd epoch the loss went NaN. ![Screenshot from 2024-04-27...
Thanks for the reply. Another question: I investigated the optimizer code, and you are splitting the model parameters into 2 groups, one with the 'block' keyword in their names and the other with all...
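For readers following along, here is a minimal sketch of the kind of two-group split described above. It is not the repository's actual optimizer code: the function name `split_param_groups`, the learning-rate values, and the idea that the two groups differ by learning rate are all assumptions made for illustration. The comment only says that parameters are grouped by whether their name contains 'block'.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    keyword: str = "block",
    keyword_lr: float = 1e-4,  # hypothetical value, not from the source
    default_lr: float = 1e-3,  # hypothetical value, not from the source
) -> List[Dict[str, Any]]:
    """Split (name, parameter) pairs into two optimizer parameter groups."""
    keyword_params: List[Any] = []
    other_params: List[Any] = []
    for name, param in named_params:
        # Skip frozen parameters when the object exposes requires_grad.
        if not getattr(param, "requires_grad", True):
            continue
        # Group 1: names containing the keyword; group 2: everything else.
        if keyword in name:
            keyword_params.append(param)
        else:
            other_params.append(param)
    # List-of-dicts layout, the shape PyTorch optimizers accept
    # for per-parameter-group options.
    return [
        {"params": keyword_params, "lr": keyword_lr},
        {"params": other_params, "lr": default_lr},
    ]
```

With PyTorch, the result could be passed as `torch.optim.AdamW(split_param_groups(model.named_parameters()))`, where `model` is any `torch.nn.Module`. Printing the parameter names that land in each group is a quick way to check that the split matches what the optimizer code intends.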