HAT
Possible bug when dealing with a dataset of 240×240 images with window_size 15
Hi, thanks for the great work on solving the SR problem via the multi-scale QKV attention module!
When I train the model on my own dataset, the LR images are 60×60 and the HR images are 240×240, so I need to set 'window_size' in options/train/xxx.yml to 15.
After this change, I ran into a problem in class OCAB(), at around line 394 in hat_arch.py:
```python
def forward(self, x, x_size, rpi):
    ....
    kv_windows = self.unfold(kv)
    ....
```
Here kv_windows has shape [4, 174240, 9], so nw*b for k and v is 36 (9 windows × a batch of 4), which does not match q, whose nw*b is 64 (16 windows × a batch of 4).
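For reference, here is a minimal sketch that reproduces the mismatch outside the model; the batch size 4, embed dim 180 (so kv has 360 channels), and overlap_ratio 0.5 are assumptions inferred from the shapes above:

```python
import torch
import torch.nn as nn

# Assumed setup matching the shapes above: batch 4, embed dim 180 (kv has
# 2 * 180 = 360 channels), 60x60 LR patches, window_size 15, overlap_ratio 0.5.
b, c, h, w = 4, 180, 60, 60
window_size = 15
overlap_win_size = int(window_size * 0.5) + window_size    # int(7.5) + 15 = 22
pad = (overlap_win_size - window_size) // 2                # 7 // 2 = 3 (floors)

unfold = nn.Unfold(kernel_size=(overlap_win_size, overlap_win_size),
                   stride=window_size, padding=pad)
kv = torch.randn(b, 2 * c, h, w)
kv_windows = unfold(kv)

print(kv_windows.shape)          # torch.Size([4, 174240, 9]) -> only 3x3 = 9 windows
print((h // window_size) ** 2)   # 16 windows expected, so q has nw*b = 16 * 4 = 64
```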
I found that the problem happens in nn.Unfold() when the window size is odd; the padding and kernel size can be changed as follows:
```python
class OCAB(nn.Module):
    ....
    def __init__(self, ....):
        ....
        # original: int() floors 15 * 0.5 = 7.5 down to 7
        # self.overlap_win_size = int(window_size * overlap_ratio) + window_size
        self.overlap_win_size = int(math.ceil(window_size * overlap_ratio)) + window_size  # needs `import math`
        ....
        # original padding floored (overlap_win_size - window_size) / 2 as well
        # self.unfold = nn.Unfold(kernel_size=(self.overlap_win_size, self.overlap_win_size), stride=window_size, padding=(self.overlap_win_size-window_size)//2)
        # -(-x // 2) is ceiling division, so the padding now rounds up instead of down
        self.unfold = nn.Unfold(kernel_size=(self.overlap_win_size, self.overlap_win_size),
                                stride=window_size,
                                padding=-(-(self.overlap_win_size - window_size) // 2))
        ....
```
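As a quick sanity check (a sketch under the same assumed settings: 60×60 input, window_size 15, overlap_ratio 0.5), the corrected kernel size and padding now produce the expected 4×4 = 16 windows per image:

```python
import math
import torch
import torch.nn as nn

window_size, overlap_ratio = 15, 0.5
overlap_win_size = int(math.ceil(window_size * overlap_ratio)) + window_size   # 8 + 15 = 23
pad = -(-(overlap_win_size - window_size) // 2)                                # ceil(8 / 2) = 4

unfold = nn.Unfold(kernel_size=(overlap_win_size, overlap_win_size),
                   stride=window_size, padding=pad)
kv = torch.randn(4, 360, 60, 60)
print(unfold(kv).shape[-1])   # 16 -> matches the 4x4 query windows (nw*b = 64 with batch 4)
```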
and, so that the relative position index for OCA stays consistent with the enlarged window size:
```python
class HAT(nn.Module):
    ....
    def calculate_rpi_oca(self):
        # window_size_ext = self.window_size + int(self.overlap_ratio * self.window_size)
        window_size_ext = self.window_size + int(math.ceil(self.overlap_ratio * self.window_size))
```
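The second change matters because window_size_ext in calculate_rpi_oca must stay equal to overlap_win_size in OCAB; otherwise the relative position index for OCA is built for a different extended window than the one that is actually unfolded. A tiny consistency-check sketch (values assumed as above):

```python
import math

window_size, overlap_ratio = 15, 0.5

overlap_win_size = int(math.ceil(window_size * overlap_ratio)) + window_size   # OCAB: 23
window_size_ext = window_size + int(math.ceil(overlap_ratio * window_size))    # HAT:  23

# Both formulas have to round the same way, or the OCA bias indexing and the
# unfolded window size diverge.
assert overlap_win_size == window_size_ext
```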
The model trains normally after these changes, but I don't know whether these corrections are right or not.