ChenyangSi
@snowcement Yes, the outputs of Conv, maxpool and attention have different channel dimensions. We first apply a 2D FFT to each channel with fft2(), then do channel-wise average pooling so that each output is reduced to a single spectrum regardless of its channel count.
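For anyone who wants to reproduce this, here is a minimal NumPy sketch of the two steps described above. It is not the authors' code. The function name `channel_averaged_spectrum`, the `(C, H, W)` layout, the tensor shapes, and the choice to take the magnitude and center the zero frequency with `fftshift` are all assumptions made for illustration. "Channel-wise average pooling" is read here as a mean over the channel axis.

```python
import numpy as np


def channel_averaged_spectrum(feat):
    """Hypothetical helper: feat has shape (C, H, W); returns an (H, W) map."""
    # 2D FFT over the two spatial axes, computed independently per channel.
    spec = np.fft.fft2(feat, axes=(-2, -1))
    # Assumption: keep the magnitude and move the zero frequency to the center.
    mag = np.abs(np.fft.fftshift(spec, axes=(-2, -1)))
    # Average over the channel axis so the result no longer depends on C.
    return mag.mean(axis=0)


# Made-up feature maps with different channel counts but equal spatial size.
rng = np.random.default_rng(0)
conv_out = rng.standard_normal((64, 14, 14))
pool_out = rng.standard_normal((32, 14, 14))
attn_out = rng.standard_normal((96, 14, 14))

spectra = [channel_averaged_spectrum(x) for x in (conv_out, pool_out, attn_out)]
shapes = [s.shape for s in spectra]
```

Because the channel axis is averaged away, all three spectra have the same spatial shape and can be compared directly even though the inputs have 64, 32 and 96 channels.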