BIPNet
About batch size in training
Thank you for your great work!
I have a question about the batch size. The experiments section of the paper does not mention it, but in your code the batch size is set to 1. Is there a reason for this choice? Training takes very long and GPU memory usage is quite low...
I'm really looking forward to your answer.
I think it is designed intentionally for the pseudo-burst architecture. If we changed the input to 5 dimensions (Batch, Burst, feat, H/2, W/2), we would have to replace conv2d with conv3d, which would cause different weights for each input. If we could make sure the kernel weights for each image in the same batch update simultaneously and remain identical, that might help accelerate training. I'm not sure whether I understand the problem correctly; looking forward to the author's response.
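To illustrate the point above: with batch size 1, a burst is just a 4-dim tensor, so a plain conv2d already treats the burst dimension as its batch dimension and applies one shared kernel to every frame. A minimal sketch (the shapes here are hypothetical, not the repo's actual values):

```python
import torch
import torch.nn as nn

# Hypothetical burst: N frames, C feature channels, spatial size H x W.
# With batch size 1, the burst tensor is 4-dim, exactly what conv2d expects.
N, C, H, W = 8, 4, 16, 16
burst = torch.randn(N, C, H, W)

conv = nn.Conv2d(C, C, kernel_size=3, padding=1)

# conv2d treats dim 0 (the burst dimension) as the batch dimension,
# so the same kernel weights are applied to all N frames.
out = conv(burst)
print(out.shape)  # torch.Size([8, 4, 16, 16])
```

Because the burst dimension sits in the batch slot, the shared weights across frames come for free; this is consistent with keeping the true batch size at 1.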
Yes, you understood correctly. As the burst is 4-dimensional, we have no alternative other than keeping the batch size at 1. One thing you can try is to combine the burst dimension with the batch dimension; with that, you can still use conv2d. But to do so you have to be very careful about feature processing in every module (especially the EBFA and PBFF modules).
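The combine-burst-with-batch suggestion can be sketched as a reshape before and after each conv2d. This is only an illustration with made-up shapes, not the repo's code; the reshape back would be needed before burst-aware modules such as EBFA and PBFF:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: batch B, burst frames N, channels C, spatial H x W.
B, N, C, H, W = 2, 8, 4, 16, 16
burst = torch.randn(B, N, C, H, W)  # 5-dim input, too many dims for conv2d

conv = nn.Conv2d(C, C, kernel_size=3, padding=1)

# Fold the burst dimension into the batch dimension so conv2d applies
# the same kernel weights to every frame of every burst.
x = burst.view(B * N, C, H, W)
y = conv(x)

# Unfold back to (B, N, C, H, W) wherever a module needs the burst
# dimension explicit (e.g., for cross-frame alignment or fusion).
y = y.view(B, N, C, H, W)
print(y.shape)  # torch.Size([2, 8, 4, 16, 16])
```

The trade-off the author points at: every module that mixes information across frames must track which folded rows belong to the same burst, otherwise frames from different bursts get fused together.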