vggvox-speaker-identification about MFCC

@linhdvu14 Hi, thanks for your code. I see you are using the model with weights from VGGVox, but where is the MFCC step? Or do you use different features?

Jan 01 '19 09:01 TTTJJJWWW

Hi, VGGVox doesn't use MFCC, only FFT spectrum. The signal processing code is in sigproc.py.

Jan 01 '19 15:01 linhdvu14

@linhdvu14 Hi, thank you for your reply. I have doubts about "VGGVox doesn't use MFCC", because the VGGVox source code contains an MFCC function (from the MFCC folder) and uses it: function [ SPEC ] = mfccspec( speech, fs, Tw, Ts, alpha, window, R, M, N, L ) % MFCC Mel frequency cepstral coefficient feature extraction. ...

Jan 03 '19 02:01 TTTJJJWWW

Yes, but if you look at the code of mfccspec, the return value SPEC is only the FFT spectrum.

Jan 03 '19 03:01 linhdvu14
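
To illustrate the distinction discussed above: a magnitude spectrogram stops after the windowed FFT, while MFCCs would go on to apply a mel filterbank, a log, and a DCT. The following is a minimal sketch of the FFT-only path, not the code in sigproc.py or mfccspec. The function names, frame length, and step are assumptions chosen for illustration, and a naive DFT from the standard library is used so the block runs without dependencies (in practice something like numpy.fft.rfft would be the usual choice).

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitude of the one-sided DFT of a single frame.

    Naive O(n^2) DFT, for illustration only.
    """
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len, frame_step):
    """Split the signal into overlapping frames, window each, take |FFT|.

    Returns a list of frames, each a list of frame_len // 2 + 1 magnitudes.
    No mel filterbank, log, or DCT is applied, so these are not MFCCs.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, frame_step):
        chunk = signal[start:start + frame_len]
        windowed = [s * w for s, w in zip(chunk, window)]
        frames.append(magnitude_spectrum(windowed))
    return frames


# Hypothetical usage: a pure tone with a period of 8 samples.
tone = [math.sin(2 * math.pi * t / 8) for t in range(64)]
spec = spectrogram(tone, frame_len=32, frame_step=16)
```

The resulting 2-D array (frames by frequency bins) is what can be treated as a single-channel image.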

Oh, I see. So you mean that the features of the wav are fed into the model as an image (a grey-scale image)? And the system essentially calculates the similarity (distance) between the images?

Jan 03 '19 09:01 TTTJJJWWW

@linhdvu14 Hi, does "weights.h5" store both the architecture and the weights, or just the weights? I want to convert it to a TensorFlow model (.pb). Can I just use the "keras_to_tensorflow" tool to do it? I look forward to your reply.

Jan 04 '19 09:01 TTTJJJWWW

It's just weights. You'd probably want to export both weights and architecture before trying keras_to_tensorflow. Or replicate the model architecture in tf and restore weights from a dict.

Jan 06 '19 01:01 linhdvu14
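
The first suggestion above (export both weights and architecture) could look roughly like this. It is a sketch under assumptions: build_model is a hypothetical callable that returns a Keras model with the same architecture the weights were trained on, and the output filename is made up. Only load_weights and save, which are standard Keras Model methods, are called on it.

```python
def export_full_model(build_model, weights_path, out_path):
    """Rebuild the architecture, load a weights-only file, save a full model.

    build_model: hypothetical zero-argument callable returning a Keras model
                 whose layers match the ones the weights were saved from.
    weights_path: path to the weights-only HDF5 file (e.g. weights.h5).
    out_path: where to write the combined architecture + weights file.
    """
    model = build_model()             # architecture comes from code, not from the file
    model.load_weights(weights_path)  # fill the layers with the stored weights
    model.save(out_path)              # single file with architecture and weights
    return out_path
```

A call might look like export_full_model(build_model, "weights.h5", "full_model.h5"), where the second filename is illustrative. The saved file, rather than the weights-only one, would then be the input to a conversion tool.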
