Issue when using nnUNet with another model
Hi, all.
I know this isn't your obligation, but I just wanted to post and see if any of you have tried something similar before.
I'm trying to use the nnUNet framework with Swin-Unet, which is a transformer-based network. This is what I encountered.
As you can see, all the losses become NaN and the pseudo Dice is NaN. I can't seem to fix this; I have tried several times.
I simply put Swin-Unet inside the build_network_architecture function (it is only used for training; data conversion still uses the U-Net framework, otherwise it does not succeed).
Thanks for any advice.
Hey, let me tag @TaWald and @saikat-roy here since they have the most experience with this kind of stuff. My 2 cents:
- nnU-Net default initial LR is way too high for those architectures
- you may want to use AdamW instead of SGD here
Best, Fabian
Hey @chenzhang9476. Just following up on @FabianIsensee here. In our experience, when we trained SwinUNet using nnUNet as the training framework, we had to reduce the learning rate to 1e-4. We did use AdamW as the optimizer instead of SGD, but my guess is that you would probably need to reduce the learning rate with SGD as well.
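To put numbers on the learning-rate advice above, here is a minimal sketch of a polynomial decay schedule. The starting value 1e-2, the exponent 0.9, and the 1000-epoch length are assumptions about the default setup rather than something stated in this thread, so please check them against your installed trainer. `poly_lr` is a hypothetical helper, not an nnU-Net function.

```python
def poly_lr(initial_lr: float, epoch: int, max_epochs: int, exponent: float = 0.9) -> float:
    """Polynomial decay: initial_lr * (1 - epoch / max_epochs) ** exponent."""
    return initial_lr * (1 - epoch / max_epochs) ** exponent


# Assumed starting points: 1e-2 (the high default under discussion)
# and 1e-4 (the value suggested above for SwinUNet).
for name, lr0 in (("assumed default 1e-2", 1e-2), ("reduced 1e-4", 1e-4)):
    schedule = [poly_lr(lr0, epoch, 1000) for epoch in (0, 250, 500, 750)]
    print(name, ["%.2e" % lr for lr in schedule])
```

With the same decay shape, the reduced schedule stays two orders of magnitude below the assumed default at every epoch, which is the gap that seems to matter for transformer-style networks.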
Thank you.
But I'm confused about deep supervision.
Hey @chenzhang9476. Can you clarify what you mean by confused? Are you trying to switch off deep supervision, or are you trying to use it without success?
Is deep supervision compatible with other new architectures like Swin-Unet?
Hey @chenzhang9476. It is compatible in principle, as long as you configure the underlying architecture/model to provide deep-supervision-like outputs to the trainer.
Are you trying to do this for SwinUnet? Can you tell us where you are stuck?
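For readers unsure what "deep-supervision-like outputs" means in practice, here is a rough sketch of the idea: the model returns one prediction per resolution level, and the per-level losses are combined with decreasing weights. The halving weights and both helper names are assumptions for illustration only, not the trainer's actual code, so check the trainer source for the exact scheme it expects.

```python
from typing import List, Sequence


def deep_supervision_weights(num_outputs: int) -> List[float]:
    """Hypothetical weighting: halve the weight per level, then normalize to sum to 1."""
    if num_outputs < 1:
        raise ValueError("need at least one output")
    raw = [1 / (2 ** level) for level in range(num_outputs)]
    total = sum(raw)
    return [w / total for w in raw]


def combine_losses(per_level_losses: Sequence[float]) -> float:
    """Weighted sum of per-level losses, highest-resolution level first (assumed order)."""
    weights = deep_supervision_weights(len(per_level_losses))
    return sum(w * loss for w, loss in zip(weights, per_level_losses))


# Three made-up loss values, one per output resolution.
print(deep_supervision_weights(3))
print(combine_losses([0.8, 0.9, 1.1]))
```

If a custom network returns only a single tensor, the simpler route is usually to switch deep supervision off rather than to fake the extra outputs.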
Hi, I tried training with another model using MONAI (e.g., SwinUNETR). Initially, the training was fine, but suddenly, the loss became NaN in the middle of the process. I already set the learning rate to 1e-4 and also tried 1e-5, using AdamW as well. Before updating nnUNet, I never encountered this error. Is there any way to fix this?
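One way to narrow down where the NaN first appears is to check every loss value before it is used and stop at the first non-finite one, so that the offending batch and step can be inspected. This is only a diagnostic sketch in plain Python; `first_non_finite` and `guard_step` are hypothetical helpers, and in a PyTorch loop you would pass `loss.item()` as the value.

```python
import math
from typing import Iterable, Optional


def first_non_finite(losses: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


def guard_step(loss_value: float, step: int) -> None:
    """Raise as soon as the loss is NaN or infinite so the step can be debugged."""
    if not math.isfinite(loss_value):
        raise FloatingPointError(f"non-finite loss {loss_value!r} at step {step}")


# Made-up loss history where training diverges at the fourth step (index 3).
history = [0.92, 0.71, 0.64, float("nan"), float("nan")]
print(first_non_finite(history))
```

Knowing the exact step makes it easier to tell whether the problem comes from a particular batch, from mixed-precision overflow, or from the optimizer settings.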