A question about the output of the visual CNN

Open pangwenfeng opened this issue 7 years ago • 0 comments

Hi, thanks for the nice paper. I have a question: the paper says that the number of frames per video is variable. How do you fuse the CNN outputs from the different frames so that the final output has a constant length? Do you simply average them, or do something else? Thank you very much.
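
To make the question concrete, here is a minimal sketch of the averaging I have in mind. It is only my guess, not something taken from the paper or the repository: the function name `average_frame_features` and the toy 3-dimensional features are hypothetical, and the real per-frame CNN outputs would be much larger.

```python
def average_frame_features(frame_features):
    """Average per-frame feature vectors into one fixed-length vector.

    frame_features: list of per-frame vectors (lists of floats), one per
    frame. The number of frames may vary from video to video, but every
    vector must have the same length. (Hypothetical helper, for
    illustration only.)
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frame vectors must have the same length")
    n = len(frame_features)
    # Element-wise mean over the frame axis.
    return [sum(f[i] for f in frame_features) / n for i in range(dim)]


# Two videos with different frame counts give outputs of the same length.
short_video = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]  # 2 frames
long_video = [[0.0, 0.0, 0.0]] * 5                # 5 frames
assert len(average_frame_features(short_video)) == 3
assert len(average_frame_features(long_video)) == 3
```

Is this roughly what you do, or is the fusion handled differently (for example max pooling, or using only a fixed number of sampled frames)?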

pangwenfeng, Dec 04 '17 15:12