Rishi

2 comments by Rishi

> [@quangkmhd](https://github.com/quangkmhd) I'm training on a GPU with 24 GB of memory. Setting `gradient_accumulation_steps` to 1 causes an out-of-memory error, but setting it to 4 causes a different error. Raw dataset...

> [@quangkmhd](https://github.com/quangkmhd) [@csqqlee](https://github.com/csqqlee) The error "ValueError: matrix contains invalid numeric entries" means your training is diverging. You can try the usual mitigations: a larger batch size or a lower learning rate. ...
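
To make the two mitigations above concrete, here is a minimal sketch in plain Python. It assumes the error is triggered by NaN or infinite values in a matrix, as the message suggests, and that the effective batch size is the per-device batch multiplied by `gradient_accumulation_steps` and the device count. The helper names `effective_batch_size` and `has_invalid_entries` are hypothetical and are not part of any library mentioned in the thread.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of samples that contribute to one optimizer step.

    Hypothetical helper: raising grad_accum_steps increases this value
    without increasing per-step memory use.
    """
    return per_device_batch * grad_accum_steps * num_devices


def has_invalid_entries(matrix):
    """Return True if any entry of a nested list is NaN or infinite.

    Hypothetical helper: a check like this on losses or cost matrices can
    reveal divergence before a downstream routine raises an error.
    """
    return any(not math.isfinite(x) for row in matrix for x in row)


# A per-device batch of 2 with 4 accumulation steps on one GPU
# gives 8 samples per optimizer step.
batch = effective_batch_size(per_device_batch=2, grad_accum_steps=4)

# A matrix with a NaN entry is flagged; a fully finite one is not.
diverged = has_invalid_entries([[0.5, float("nan")], [1.0, 2.0]])
healthy = has_invalid_entries([[0.5, 0.25], [1.0, 2.0]])
```

If a check like `has_invalid_entries` fires early in training, that would be consistent with the divergence described above, and lowering the learning rate or raising the effective batch size are the first things to try.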