VGGVox

about the pool_time layer

Open • hktxt opened this issue 6 years ago • 1 comment

I'm confused about the pool_time layer. As said, 'modify the layers to adapt to the spectrogram'. With an input of size 512*300 (a 3 s segment), the ResNet-50 outputs 9*8*2048, followed by a 9*1*2048 fully connected layer. How does the 1*N avg pool layer work? The 9*1*2048 fc layer has nothing to do with N. It looks like it could be followed directly by the fc2 (5994) layer to give the output... please help.

hktxt • Nov 09 '18 02:11

The 9*1*2048 fc1 layer outputs a feature of size 1*8*2048 when fed an input of size 9*8*2048, so N=8 in this situation.
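To make the shapes above concrete, here is a rough sketch of the arithmetic in plain Python. It assumes fc1 behaves like an unpadded 9*1 convolution (covering all 9 frequency bins, one time step at a time) and that pool_time averages over whatever time length is left, so N is not a property of fc1. The helper names `valid_out` and `head_shapes` are made up for illustration and are not taken from the VGGVox code.

```python
def valid_out(size, kernel, stride=1):
    """Output length of an unpadded ('valid') sliding window along one axis."""
    return (size - kernel) // stride + 1


def head_shapes(freq, time, channels=2048):
    """Trace a (freq, time, channels) feature map through fc1 and pool_time.

    Assumption: fc1 is applied as a 9x1 window, pool_time as a 1xN window.
    """
    # fc1: spans 9 frequency bins and a single time step.
    fc1 = (valid_out(freq, 9), valid_out(time, 1), channels)
    # pool_time: N is simply the number of time steps fc1 left behind.
    n = fc1[1]
    pooled = (valid_out(fc1[0], 1), valid_out(fc1[1], n), channels)
    return fc1, n, pooled


# 3 s segment from the question: the ResNet-50 output is 9*8*2048.
print(head_shapes(freq=9, time=8))   # ((1, 8, 2048), 8, (1, 1, 2048))

# A hypothetical longer segment with 20 time steps: same fc1, larger N.
print(head_shapes(freq=9, time=20))  # ((1, 20, 2048), 20, (1, 1, 2048))
```

Under these assumptions the fc1 weights stay the same for any segment length. Only N changes, and the pooled 1*1*2048 vector is what would then go into fc2 (5994).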

yuyq96 • Mar 28 '19 05:03