Bo Huang
When I use amp opt-level O1 to train swin-large_patch4_window7_224 on ImageNet-22K, I get a NaN loss and grad_norm ever since epoch [1/60] iter [880/3466]. The training process was normal before that,...
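Not sure what causes the divergence in this setup, but a common first step is to guard the update against non-finite losses and watch the gradient norm around the apex O1 training step. Below is a minimal sketch of such a guard; the model, optimizer, and data here are hypothetical stand-ins, not the actual Swin training code.

```python
import torch
import torch.nn as nn
from apex import amp

# Hypothetical stand-ins for the real Swin model and ImageNet-22K data.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 21841)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for step in range(100):
    images = torch.randn(8, 3, 224, 224, device="cuda")
    targets = torch.randint(0, 21841, (8,), device="cuda")
    loss = criterion(model(images), targets)

    # Skip the update entirely if the loss is already non-finite,
    # so one bad batch does not poison the optimizer state.
    if not torch.isfinite(loss):
        print(f"step {step}: non-finite loss, skipping update")
        optimizer.zero_grad()
        continue

    optimizer.zero_grad()
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()

    # Clip on the fp32 master params and log the norm to see when it blows up.
    grad_norm = torch.nn.utils.clip_grad_norm_(amp.master_params(optimizer), max_norm=5.0)
    optimizer.step()
```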
It seems that the original image, instead of the cropped image, will be fed into the blob if you skip the rotation process (set rotation_interval=1). In src/caffe/data_transformer.cpp, the cv_img is cropped...
I wrote a shuffle-large config following swin-large and trained it on the ImageNet-22K dataset using apex O1. But the training process is unstable and the loss quickly becomes NaN. Is there any...
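This is not from the Swin repo, just a generic debugging sketch that may help narrow things down: registering forward hooks that flag the first module whose output goes non-finite can show whether the instability starts in the attention blocks or in the head. The toy model and names here are placeholders for your shuffle-large model.

```python
import torch
import torch.nn as nn

def register_nan_hooks(model: nn.Module):
    """Attach forward hooks that report the first module producing a non-finite output."""
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                print(f"non-finite activation first seen in: {name} ({module.__class__.__name__})")
                raise RuntimeError(f"NaN/Inf detected in {name}")
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module itself
            module.register_forward_hook(make_hook(name))

# Usage with a toy model; swap in your own model instead.
if __name__ == "__main__":
    toy = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
    register_nan_hooks(toy)
    toy(torch.randn(2, 16))  # raises only if some module emits NaN/Inf
```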