vggvox-speaker-identification
about the conv_bn_dynamic_apool
I read your code and found that the 9*1 layer in the conv_bn_dynamic_apool() function is a conv layer. The paper says "replaced by two layers - a fully connected layer of 9*1 and an average layer with 1/*8...". I was stuck on this for a long time. Maybe you are right and it is a conv layer, which makes sense.
Another question: why is K.l2_normalize used?
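On the K.l2_normalize question, one common reason for L2-normalizing embeddings is that the distance between two vectors then depends only on their direction, not their length. Whether that is the intent in this repo is an assumption on my part. A minimal NumPy sketch of the idea (the `l2_normalize` helper below is a hypothetical stand-in, not the Keras function itself):

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Hypothetical stand-in: divide x by its L2 norm along `axis`.
    # eps guards against division by zero for an all-zero vector.
    sq_sum = np.sum(np.square(x), axis=axis, keepdims=True)
    return x / np.sqrt(np.maximum(sq_sum, eps))

a = np.array([3.0, 4.0])
b = np.array([6.0, 8.0])   # same direction as a, twice the length

a_n, b_n = l2_normalize(a), l2_normalize(b)

raw_dist = float(np.linalg.norm(a - b))       # non-zero: lengths differ
norm_dist = float(np.linalg.norm(a_n - b_n))  # ~0: directions agree
cosine = float(np.dot(a_n, b_n))              # dot product of unit vectors
```

After normalization the two vectors coincide, so a distance threshold on the embeddings compares direction only.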
The wavreader function produces different results from MATLAB.
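One possible cause, which I have not verified against this repo: MATLAB's audioread returns samples scaled to [-1, 1] by default, while many Python readers return the raw 16-bit integers. A minimal sketch using only the standard library, assuming 16-bit PCM input (the helper names `read_wav_int16` and `to_unit_scale` are made up for illustration):

```python
import io
import struct
import wave

def read_wav_int16(file_or_path):
    # Read all frames of a 16-bit PCM WAV file as raw integers.
    with wave.open(file_or_path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_frames = w.getnframes()
        n_values = n_frames * w.getnchannels()
        raw = w.readframes(n_frames)
    return list(struct.unpack("<%dh" % n_values, raw))

def to_unit_scale(samples):
    # Divide by 2**15 so int16 values land in [-1, 1).
    return [s / 32768.0 for s in samples]

# Build a tiny in-memory WAV file so the example is self-contained.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(struct.pack("<4h", 0, 16384, -32768, 32767))
buf.seek(0)

ints = read_wav_int16(buf)
floats = to_unit_scale(ints)
```

If the two readers differ only by this constant factor, comparing `floats` rather than `ints` against the MATLAB output should show it.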
FileNotFoundError: File b'cfg/enroll_list.csv' does not exist. Can you help me?
Pretty sure I got the layer structure by following the Matlab model. Will check/update when I have more time.
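For what it's worth, a 9*1 conv layer and a fully connected layer over the frequency axis compute the same thing when the kernel spans all 9 frequency rows. Here is a rough NumPy sketch of that equivalence, followed by average pooling over time. All sizes and names are hypothetical and this is not the repo's conv_bn_dynamic_apool():

```python
import numpy as np

rng = np.random.default_rng(0)
F, T, C_in, C_out = 9, 5, 4, 3  # hypothetical sizes; F=9 matches the 9*1 kernel

x = rng.standard_normal((F, T, C_in))         # frequency x time x channels
w = rng.standard_normal((F, 1, C_in, C_out))  # kernel: 9 rows, 1 time step
b = rng.standard_normal(C_out)

def conv_9x1_valid(x, w, b):
    # 'valid' convolution (no kernel flip, as in deep learning frameworks).
    # The kernel covers every frequency row, so the output height is 1.
    out = np.einsum("ftc,fco->to", x, w[:, 0]) + b
    return out[None, :, :]                    # shape (1, T, C_out)

def dense_per_time_step(x, w, b):
    # Same weights viewed as a fully connected layer applied to each
    # time column after flattening frequency and channels.
    W = w[:, 0].reshape(F * C_in, C_out)
    cols = x.transpose(1, 0, 2).reshape(T, F * C_in)
    return cols @ W + b                       # shape (T, C_out)

conv_out = conv_9x1_valid(x, w, b)
fc_out = dense_per_time_step(x, w, b)

# Average pooling over the remaining (1 x T) map gives one vector per
# utterance regardless of T.
pooled = conv_out.mean(axis=(0, 1))
```

Because the pooling averages over whatever time length is left, the same weights work for inputs of any duration.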