NAFNet
About the training speed
Thank you for your wonderful work. I use four RTX 3090 GPUs with a batch size of 4 per GPU. It takes about 15 hours to train NAFNet-width32 on SIDD, and about 3 days to train NAFNet-width64. I would like to know whether this speed is normal?
Hi meitounao110, thanks for your attention to NAFNet. This speed is normal, but it is recommended to keep the overall batch size consistent with our config (to reproduce our results), e.g. 32 (4×8) for NAFNet-width32 and 64 (8×8) for NAFNet-width64.
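In other words, the effective batch size is the per-GPU batch size times the number of GPUs, so with fewer GPUs you would raise the per-GPU batch size to match. A minimal sketch of the arithmetic, assuming the 4-GPU setup from the question above (variable names are illustrative, not from the repo's config):

```python
# Effective batch size = num_gpus * batch_size_per_gpu.
# The NAFNet-width32 config uses 8 GPUs x 4 samples per GPU = 32 total.
paper_total_batch = 32   # overall batch size from the config (4x8)
num_gpus = 4             # e.g. four RTX 3090s, as in the question above
batch_per_gpu = paper_total_batch // num_gpus  # -> 8 samples per GPU
assert num_gpus * batch_per_gpu == paper_total_batch
print(f"use batch_size_per_gpu = {batch_per_gpu} to keep the total at {paper_total_batch}")
```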
Thank you very much! Another question: what is the role of $\gamma$ and $\beta$ in the NAFBlock? They do not seem to be mentioned in the paper.
Hi meitounao110, that is the skip-init we mention in the experimental section to stabilize training.
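Concretely, skip-init scales each residual branch by a learnable per-channel parameter initialized to zero, so every block starts out as an identity mapping. A minimal sketch of the idea (the class name and the conv body are illustrative stand-ins, not the actual NAFBlock):

```python
import torch
import torch.nn as nn

class SkipInitBlock(nn.Module):
    """Sketch of skip-init for a residual block y = x + gamma * f(x):
    gamma is a learnable per-channel scale initialized to zero, so the
    block is an identity mapping at the start of training."""

    def __init__(self, channels: int):
        super().__init__()
        # stand-in for the block's real operations
        self.body = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # zero-initialized per-channel scale (the gamma/beta in NAFBlock)
        self.gamma = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.gamma * self.body(x)
```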
Hi, thanks for your excellent work! Can I ask a question about training NAFNet-width64 on REDS? I use three RTX 3090 GPUs with all other configs the same as yours, and train for the same 400,000 iterations, but I get PSNR = 28.8342 and SSIM = 0.8623, which is lower than the paper. Are there other reasons that could cause this? Or which iteration's model did you choose to get the best results reported in the paper (29.09, 0.867)? Thank you very much!
Refer to https://github.com/megvii-research/NAFNet/issues/24#issuecomment-1196259074
Hi, may I ask what batch size you set when experimenting on a 3090 with 24 GB of VRAM, and did you manage to reproduce the metrics reported in the paper?