FANet
Why only 2 frames for spatial-temporal context aggregation?
Hi. Your paper is very interesting and inspiring to read. I was wondering why you integrated the features of only ONE neighboring frame to facilitate inference on the current frame. Have you experimented with more frames? If so, what was the effect?
Thank you.