ens10
The T2m results seem abnormal for Unet with a batch size of 1
The loss and metric are extremely low when using the default batch size of 1 and the default learning rate of 0.01 for Unet.
However, the results look normal when switching the batch size to 8 or the learning rate to 0.00005 for Unet.
Is there anything wrong with the code?
I found that this happens because the model can drive the loss extremely low by making its std prediction (output[:, 1]) extremely small. The prediction consists of the mean, output[:, 0], and the std, torch.exp(output[:, 1]).
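A minimal sketch of why this collapse makes the loss "extremely low", assuming a Gaussian negative log-likelihood loss over a mean/log-std output head (the exact loss in the repo may differ; `gaussian_nll` and the tensor shapes here are illustrative). Once the predicted mean is close to the target, the `log_std` term dominates and the loss is unbounded below as `output[:, 1]` goes to negative infinity:

```python
import torch

def gaussian_nll(output, target):
    # Assumed head layout, matching the post: channel 0 is the mean,
    # channel 1 the log-std, so std = exp(output[:, 1]).
    mean, log_std = output[:, 0], output[:, 1]
    std = torch.exp(log_std)
    # Gaussian NLL up to a constant: log(std) + 0.5 * ((y - mu) / std)^2
    return (log_std + 0.5 * ((target - mean) / std) ** 2).mean()

target = torch.zeros(4)
mean = torch.zeros(4)  # mean already matches the target
# Pushing log-std more negative makes the loss arbitrarily negative,
# i.e. "extremely low" without the forecast actually improving.
for ls in [0.0, -5.0, -20.0]:
    out = torch.stack([mean, torch.full((4,), ls)], dim=1)
    print(ls, gaussian_nll(out, target).item())
# → 0.0 0.0
# → -5.0 -5.0
# → -20.0 -20.0
```

With a batch size of 1 and a large learning rate, single easy samples can push the log-std channel into this degenerate regime; larger batches or a smaller learning rate average that incentive away, which matches the behavior described above. A common fix is clamping `log_std` to a minimum value or adding a floor to the std.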