
grad explosion

Open liangbingzhao opened this issue 1 year ago • 0 comments

I am training Paella on MSCOCO, with the model downsized slightly to 247M parameters. The training loss suddenly increases and then becomes NaN. How can I solve this problem? (image attached)

liangbingzhao, Nov 23 '23 12:11
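
A loss that rises sharply and then turns into NaN is often a sign of exploding gradients, and a common first thing to try is clipping the global gradient norm and skipping any step whose gradients are not finite. The following is a minimal, framework-free sketch of that idea, not Paella's actual training code: the function name `clip_by_global_norm`, the flat list of gradient values, and the `max_norm` value are all assumptions made for illustration. In a PyTorch training loop the corresponding built-in is `torch.nn.utils.clip_grad_norm_`.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Hypothetical helper for illustration only.
    Returns (clipped_grads, original_norm). If the norm is NaN or inf,
    clipped_grads is None so the caller can skip the optimizer step.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if not math.isfinite(total_norm):
        # Non-finite gradients cannot be rescaled meaningfully.
        return None, total_norm
    # Scale is 1.0 when the norm is already small enough.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads], total_norm


# Usage: gradients with norm 5.0 are scaled down to roughly norm 1.0.
clipped, norm = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
if clipped is None:
    print("non-finite gradients, skipping this step")
else:
    print("original norm:", norm)
```

Logging the returned norm at every step can also show whether the gradients grow gradually before the loss diverges or spike on a single batch, which helps decide between lowering the learning rate, adding warmup, or inspecting the data.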