SuperK

Results 28 comments of SuperK

@feiyun1265 You should also use VGGish to extract the audio features, concatenate them with the image features, and then feed the result to the model.

https://github.com/tensorflow/models/tree/master/research/audioset If you are using YouTube-8M, you don't need it. The image feature has 1024 dimensions and the audio feature has 128 dimensions; you should use both.
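A minimal sketch of the concatenation step described above, assuming per-frame features stored as plain Python lists. The function name `concat_frame_features` and the placeholder data are hypothetical and only illustrate the shapes involved (1024 + 128 = 1152 per frame); they are not taken from any particular codebase.

```python
from typing import List, Sequence

RGB_DIM = 1024    # per-frame image feature size mentioned above
AUDIO_DIM = 128   # per-frame audio feature size mentioned above


def concat_frame_features(rgb: Sequence[Sequence[float]],
                          audio: Sequence[Sequence[float]]) -> List[List[float]]:
    """Join image and audio features frame by frame (hypothetical helper)."""
    if len(rgb) != len(audio):
        raise ValueError("rgb and audio must cover the same number of frames")
    fused = []
    for rgb_vec, audio_vec in zip(rgb, audio):
        if len(rgb_vec) != RGB_DIM or len(audio_vec) != AUDIO_DIM:
            raise ValueError("unexpected feature dimension")
        # Image features first, audio features appended after them.
        fused.append(list(rgb_vec) + list(audio_vec))
    return fused


# Placeholder data standing in for real extracted features (3 frames).
rgb = [[0.0] * RGB_DIM for _ in range(3)]
audio = [[1.0] * AUDIO_DIM for _ in range(3)]
fused = concat_frame_features(rgb, audio)
print(len(fused), len(fused[0]))  # 3 1152
```

The order (image first, then audio) is an assumption; whichever order is chosen has to match what the model was trained with.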

@SharoneDayan I think you should try freezing the model.

Have you turned on all the compiler optimization options?

Set the minimum detection box to 80*80 and turn on all the compiler optimization options.

Are you using the 620 model? Have you set compiler optimization to the maximum level?

Then I can't think of anything else. It may be related to the computing hardware; I ran my earlier tests on a server with a 3.2 GHz CPU.

Post the image you tested and I'll run it myself.

![1](https://user-images.githubusercontent.com/8406285/30463247-1bdfc7c6-99fd-11e7-8cc5-4051b4912b33.jpg)