hugo-xie

> Yes it's normal

So does the loss keep fluctuating from the beginning to the end of training? If so, how do we choose the final model checkpoint?