DeepSpeaker-pytorch Question regarding filter bank

Great job on implementing the paper!

Question: why did you use python_speech_features.fbank instead of librosa.feature.melspectrogram? Both transformations are the same, right?

Mar 30 '18 13:03 evaldsurtans

They are almost the same, except that fbank skips the IDFT step, so it retains more of the audio's source information.

Jun 05 '18 07:06 onceforall

Thank you for answering. I suspect you have mistaken melspectrogram/filterbank for MFCC, because as far as I understand the IDFT is only used for MFCC. So python_speech_features.fbank should be the same as librosa.feature.melspectrogram, right? https://librosa.github.io/librosa/generated/librosa.feature.melspectrogram.html https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html

Jul 11 '18 10:07 evaldsurtans
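
For reference, here is a minimal NumPy-only sketch of the distinction discussed in this thread. It is not the code of python_speech_features or librosa. All function names are hypothetical, and the HTK-style mel formula, Hann window, and frame sizes are assumptions chosen for illustration. It shows that filter-bank (mel spectrogram) features stop at the mel-weighted power spectrum, while MFCC adds a log and a final transform (a DCT in common practice) on top of them. Even if both library functions implement the same idea, their outputs may still differ numerically because of different defaults (framing, windowing, mel-scale variant, filter normalization).

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style mel scale (assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    filters = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        filters[i] = np.maximum(0.0, np.minimum(rising, falling))
    return filters

def power_spectrogram(signal, frame_len, hop, n_fft):
    """Split into overlapping frames, window, and take |FFT|^2 per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] for i in range(n_frames)])
    frames = frames * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft

def mel_energies(signal, sample_rate, n_filters=26, frame_len=400, hop=160, n_fft=512):
    """Filter-bank / mel spectrogram features: no inverse transform involved."""
    spec = power_spectrogram(signal, frame_len, hop, n_fft)
    return spec @ mel_filterbank(n_filters, n_fft, sample_rate).T

def mfcc_from_mel(mel, n_coeffs=13):
    """MFCC-style features: log of the mel energies, then an unnormalized DCT-II."""
    log_mel = np.log(np.maximum(mel, 1e-10))
    n = log_mel.shape[1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return log_mel @ basis.T

# Synthetic one-second 440 Hz tone at 16 kHz (illustrative input only)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)

mel = mel_energies(signal, sample_rate)   # one row per frame, one column per filter
mfcc = mfcc_from_mel(mel)                 # one row per frame, one column per coefficient
print(mel.shape, mfcc.shape)
```

To compare the two real library functions directly, one would need to align their parameters (frame length, hop, FFT size, number of filters, window, pre-emphasis) before checking the outputs against each other.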