Bug in FFT windowing
Your windowing for the FFT when calculating the MFCCs is incorrect, which will reduce performance a little (DTW seems very robust). The default runtime configuration results in frame_length = 1600, which is greater than fft_order = 512. As a result, the last frame_length - fft_order = 1088 elements are chopped off the Hamming window. Your actual window is only the first third of the Hamming window, which is a very bad choice of window because it never tapers back down at the end.
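To make the effect concrete, here is a minimal sketch using the default values quoted above. It does not call aeneas itself; the slicing is only my assumption of what the truncation amounts to, and the variable names are hypothetical.

```python
import numpy as np

frame_length = 1600  # default frame length quoted in this report
fft_order = 512      # default FFT order quoted in this report

# Full Hamming window over the whole frame.
window = np.hamming(frame_length)

# Assumed effect of the truncation: only the first fft_order samples
# of the windowed frame ever reach the FFT.
effective_window = window[:fft_order]

dropped = frame_length - fft_order        # samples that are discarded
fraction_used = fft_order / frame_length  # share of the window that survives
peak_index = int(np.argmax(window))       # where the full window peaks

print("dropped samples:", dropped)
print("fraction of window used:", fraction_used)
print("window start / truncated end / full peak:",
      effective_window[0], effective_window[-1], window.max())
```

The truncated window starts at the usual Hamming edge value and is cut off while still rising, well before the peak of the full window, so the frame ends on a hard edge.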
https://github.com/readbeyond/aeneas/blob/4d200a050690903b30b3d885b44714fecb23f18a/aeneas/mfcc.py#L218
https://github.com/readbeyond/aeneas/blob/4d200a050690903b30b3d885b44714fecb23f18a/aeneas/mfcc.py#L230
Beside the point, this line should be:
self.hamming_window = numpy.hamming(frame_length)
Your padding is appended only after the windowing is done:
https://github.com/readbeyond/aeneas/blob/4d200a050690903b30b3d885b44714fecb23f18a/aeneas/mfcc.py#L192
The same applies to the C code:
https://github.com/readbeyond/aeneas/blob/4d200a050690903b30b3d885b44714fecb23f18a/aeneas/cmfcc/cmfcc_func.c#L519
https://github.com/readbeyond/aeneas/blob/4d200a050690903b30b3d885b44714fecb23f18a/aeneas/cmfcc/cmfcc_func.c#L571
https://github.com/readbeyond/aeneas/blob/4d200a050690903b30b3d885b44714fecb23f18a/aeneas/cmfcc/cmfcc_func.c#L575-L585
It's hard to say what to change. For accuracy, I imagine you would want to keep the window_length the same, which requires a higher FFT order. For speed, however, you would want to keep the fft_order down, which I imagine is how this bug came about.
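As a rough sketch of the "higher FFT order" option: window the full frame, then let the FFT zero-pad up to the next power of two. The helper next_power_of_two and the random test frame are hypothetical, used only for illustration; this is not how aeneas is currently written.

```python
import numpy as np

def next_power_of_two(n):
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

frame_length = 1600                          # default frame length quoted above
fft_order = next_power_of_two(frame_length)  # 2048 for a 1600-sample frame

# Hypothetical input frame, just to have something to transform.
frame = np.random.default_rng(0).standard_normal(frame_length)

# Apply the full-length window first, so nothing is chopped off.
windowed = frame * np.hamming(frame_length)

# numpy.fft.rfft zero-pads its input up to n when n > len(input).
spectrum = np.fft.rfft(windowed, n=fft_order)
power = np.abs(spectrum) ** 2

print(fft_order, spectrum.shape)
```

The cost is an FFT four times longer than the current fft_order = 512, which is the speed trade-off mentioned above.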
I think the best solution is to do away with the windowing altogether, i.e. use the rectangular window, as it's the widest.