ens10
The T2m results seem abnormal for Unet with a batch size of 1
The loss and metric are extremely low when using the default batch size of 1 and the default learning rate of 0.01 for Unet.
However, the results look normal when switching the batch size to 8 or the learning rate to 0.00005 for Unet.
Is there anything wrong with the code?
I found that this happens because the model can drive the loss extremely low by making its std prediction (output[:, 1]) extremely small. The prediction consists of the mean, output[:, 0], and the std, torch.exp(output[:, 1]).
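A minimal sketch of why this collapse makes the loss "extremely low", assuming a Gaussian negative log-likelihood loss over a mean/log-std output head (the exact loss in the repo may differ; `gaussian_nll` and the tensor shapes here are illustrative). Once the predicted mean is close to the target, the `log_std` term dominates and the loss is unbounded below as `output[:, 1]` goes to negative infinity:

```python
import torch

def gaussian_nll(output, target):
    # Assumed head layout, matching the post: channel 0 is the mean,
    # channel 1 the log-std, so std = exp(output[:, 1]).
    mean, log_std = output[:, 0], output[:, 1]
    std = torch.exp(log_std)
    # Gaussian NLL up to a constant: log(std) + 0.5 * ((y - mu) / std)^2
    return (log_std + 0.5 * ((target - mean) / std) ** 2).mean()

target = torch.zeros(4)
mean = torch.zeros(4)  # mean already matches the target
# Pushing log-std more negative makes the loss arbitrarily negative,
# i.e. "extremely low" without the forecast actually improving.
for ls in [0.0, -5.0, -20.0]:
    out = torch.stack([mean, torch.full((4,), ls)], dim=1)
    print(ls, gaussian_nll(out, target).item())
# → 0.0 0.0
# → -5.0 -5.0
# → -20.0 -20.0
```

With a batch size of 1 and a large learning rate, single easy samples can push the log-std channel into this degenerate regime; larger batches or a smaller learning rate average that incentive away, which matches the behavior described above. A common fix is clamping `log_std` to a minimum value or adding a floor to the std.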