diffusion_policy
Some Problems About GroupNorm
Hello! Both in your tasks and in several others I have tried, replacing BatchNorm with GroupNorm in the visual model of the diffusion policy significantly improves action prediction and also makes training more stable. With BatchNorm, noise prediction keeps improving and converges during training, but action prediction does not improve in step with it and remains very unstable. Your paper also mentions that BatchNorm needs to be replaced, but it does not give a theoretical explanation for the change. Could you offer a theoretical explanation, or some pointers, for this phenomenon?
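For readers following the thread, the mechanical difference between the two layers can be sketched in plain Python. This is only an illustration under simplifying assumptions (no spatial dimensions, no learned scale or shift, BatchNorm in training mode), and the function names are hypothetical. It shows that a sample's BatchNorm output depends on the other samples in the batch, while its GroupNorm output does not; it is not offered as a confirmed explanation of the behavior described above.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability, as in typical norm layers


def normalize(values):
    """Standardize a flat list of floats to zero mean and (almost) unit variance."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def batch_norm(batch):
    """batch[n][c]: normalize each channel c with statistics taken over the batch axis n."""
    n_channels = len(batch[0])
    columns = [normalize([sample[c] for sample in batch]) for c in range(n_channels)]
    return [[columns[c][n] for c in range(n_channels)] for n in range(len(batch))]


def group_norm(batch, num_groups):
    """batch[n][c]: normalize each sample with statistics taken over each channel group."""
    size = len(batch[0]) // num_groups
    out = []
    for sample in batch:
        row = []
        for g in range(num_groups):
            row.extend(normalize(sample[g * size:(g + 1) * size]))
        out.append(row)
    return out


# The same sample placed in two different batches.
sample = [1.0, 2.0, 3.0, 4.0]
batch_a = [sample, [0.0, 0.0, 0.0, 0.0]]
batch_b = [sample, [10.0, -5.0, 7.0, 2.0]]

# BatchNorm output for `sample` changes with its batch mates; GroupNorm output does not.
bn_changes = batch_norm(batch_a)[0] != batch_norm(batch_b)[0]
gn_changes = group_norm(batch_a, 2)[0] != group_norm(batch_b, 2)[0]
print("BatchNorm output depends on the rest of the batch:", bn_changes)
print("GroupNorm output depends on the rest of the batch:", gn_changes)
```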
Thanks for your post, @Selen-Suyue. It seems like this repo is missing the GroupNorm + spatial softmax mentioned in the paper.
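As a rough reference for anyone who wants to try the swap themselves, here is a minimal sketch assuming PyTorch. The helper name `replace_bn_with_gn`, the toy `encoder`, and the choice of 16 channels per group are all illustrative assumptions, not settings confirmed in this thread or taken from the paper.

```python
import torch
from torch import nn


def replace_bn_with_gn(module: nn.Module, channels_per_group: int = 16) -> nn.Module:
    """Recursively swap every nn.BatchNorm2d in `module` for an nn.GroupNorm.

    Hypothetical helper: `channels_per_group` is an assumed default.
    """
    # Copy the children into a list so the module can be modified while looping.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.BatchNorm2d):
            channels = child.num_features
            # GroupNorm requires num_channels to be divisible by num_groups.
            groups = max(1, channels // channels_per_group)
            while channels % groups != 0:
                groups -= 1
            setattr(module, name, nn.GroupNorm(groups, channels))
        else:
            replace_bn_with_gn(child, channels_per_group)
    return module


# Toy convolutional encoder standing in for a real visual backbone (assumption).
encoder = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU(),
)
encoder = replace_bn_with_gn(encoder)
features = encoder(torch.zeros(2, 3, 8, 8))
print(encoder)
```

Note that the new GroupNorm layers start from their default initialization, so any pretrained BatchNorm scale, shift, and running statistics are discarded by this swap.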
What settings did you use for GroupNorm, @Selen-Suyue?