act-plus-plus
Loss becomes NaN during training
I use batch size = 32 and lr = 1e-5.
Can anyone help me solve this problem? Thanks.