
About batch size in training

Open woojin9605 opened this issue 2 years ago • 2 comments

Thank you for your great work!

I have a question about the batch size. In the paper, there is no mention of the batch size in the experiments section. In your code, however, the batch size is set to 1. Is there a reason for this? With batch size 1, the training time is very long and GPU memory usage is quite low.

I'm really looking forward to your answer.

woojin9605 avatar Jul 07 '22 02:07 woojin9605

I think it is designed intentionally for the pseudo-burst architecture. If we modified the input to 5 dimensions, (Batch, Burst, feat, H/2, W/2), we would have to convert conv2d to conv3d, which is a different operation with different weights. If the kernel weights for each image in the same batch could be updated simultaneously while remaining shared, that might help speed up training. I'm not sure whether I understand the problem correctly; looking forward to the response from the author.
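
For illustration, a minimal sketch of the dimensionality issue in PyTorch (the shapes here are made up for the example; this is not the BIPNet code):

```python
import torch
import torch.nn as nn

# Hypothetical shapes for illustration only: one burst of 8 frames,
# 64 features each, at half resolution 24x24.
burst = torch.randn(1, 8, 64, 24, 24)  # (Batch, Burst, feat, H/2, W/2)

conv2d = nn.Conv2d(64, 64, kernel_size=3, padding=1)
# conv2d(burst)  # fails: Conv2d expects a 4-D (N, C, H, W) input

# The 5-D alternative is Conv3d, but its kernel also spans the burst
# dimension, so it is a genuinely different operation with different weights:
conv3d = nn.Conv3d(64, 64, kernel_size=3, padding=1)
out = conv3d(burst.permute(0, 2, 1, 3, 4))  # Conv3d wants (N, C, Burst, H, W)
print(out.shape)  # torch.Size([1, 64, 8, 24, 24])
```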

Tony-Tseng avatar Jul 08 '22 12:07 Tony-Tseng

Yes, you understood correctly. Since a burst is 4-dimensional, we have no alternative other than keeping the batch size at 1. One thing you can try is to fold the burst dimension into the batch dimension; with this, you can still use conv2d. But to do so, you have to be very careful about the feature processing in every module (especially in the EBFA and PBFF modules).
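
A minimal sketch of what this folding could look like (the helper name and shapes are illustrative, not from the released code):

```python
import torch
import torch.nn as nn

def conv2d_over_burst(burst, conv):
    """Fold the burst dimension into the batch dimension so an
    ordinary Conv2d treats every frame as an independent sample."""
    b, t, c, h, w = burst.shape
    x = burst.reshape(b * t, c, h, w)   # (B*T, C, H, W): Conv2d-compatible
    x = conv(x)                         # weights are shared across all frames
    return x.view(b, t, *x.shape[1:])   # restore (B, T, C', H', W')

burst = torch.randn(4, 8, 64, 24, 24)   # now a real batch of 4 bursts
conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
print(conv2d_over_burst(burst, conv).shape)  # torch.Size([4, 8, 64, 24, 24])
```

The caveat is exactly the one above: any module that mixes information across frames (as EBFA and PBFF do) must unfold back to (B, Burst, ...) first, or frames from different bursts will leak into each other.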

akshaydudhane16 avatar Jul 11 '22 17:07 akshaydudhane16