nnAudio [Feature Request] Allow STFT kernels to be normalized

Open Manza12 opened this issue 2 years ago • 3 comments

I think it would be nice to have normalization options for the STFT kernels, in order to control the norm of the output. The CQT already has them in the forward pass, through the normalization_type parameter.

If you want I can do a PR.
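A minimal NumPy sketch of what kernel normalization could look like, for illustration only. The function name make_stft_kernels and the norm values "l1" and "l2" are hypothetical and are not part of nnAudio's API; the Hann window is also just an assumption.

```python
import numpy as np


def make_stft_kernels(n_fft, norm=None):
    """Build Hann-windowed complex Fourier kernels, one row per frequency bin.

    norm is a hypothetical option: None (no scaling), "l1" or "l2".
    """
    n = np.arange(n_fft)
    # Periodic Hann window (assumed here; any window would work the same way).
    window = 0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft)
    k = np.arange(n_fft // 2 + 1)[:, None]
    kernels = window * np.exp(-2j * np.pi * k * n / n_fft)

    if norm == "l1":
        # Each row is scaled so that the sum of absolute values is 1.
        kernels = kernels / np.abs(kernels).sum(axis=1, keepdims=True)
    elif norm == "l2":
        # Each row is scaled so that its Euclidean norm is 1.
        kernels = kernels / np.sqrt((np.abs(kernels) ** 2).sum(axis=1, keepdims=True))
    elif norm is not None:
        raise ValueError(f"unknown norm: {norm!r}")
    return kernels


kernels_l1 = make_stft_kernels(512, norm="l1")
kernels_l2 = make_stft_kernels(512, norm="l2")
```

Since every kernel has the window as its modulus, the scaling factor is the same for all frequency bins, so in practice it could be applied once to the window itself.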

Nov 25 '21 20:11 Manza12

Yes please! It would be a great help. We would then also need to add a few more test cases to the pytest file, if possible, to make sure the normalizations are correct. Currently I am using librosa as a reference, but it seems that librosa does not provide a normalization option either. Do you know of any alternative references that we can compare our implementation with?

Nov 26 '21 02:11 KinWaiCheuk

Regarding tests, the only idea I have is to test the STFT of pure unit-amplitude sinusoids, which will have a maximum value of 1/2 if the kernels are normalized with the L1 norm. We could also implement a "wrap" style, which wraps positive and negative frequencies together to give a maximum value of 1. These normalization questions came up for me because, when working with floating-point audio signals (which lie between -1 and 1), I want the STFT (or CQT) magnitude to lie between 0 and 1. It may also be interesting to have L2 normalization, which ensures an interesting property illustrated in the following formula (from Foundations of Time-Frequency Analysis by K. Gröchenig): [image: normalization_property]

Here, g is the window function and V_g stands for the STFT. We could also check that property in the tests, but we may have approximation errors...
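The two checks could be sketched as follows in plain NumPy (not nnAudio code). Several things here are assumptions: a periodic Hann window, a sinusoid centered exactly on a frequency bin, and, for the L2 part, that the formula in the missing image is the energy identity ‖V_g f‖₂ = ‖f‖₂ ‖g‖₂. The discrete version below uses a hop size of 1, circular framing and a unitary DFT so that the identity holds without extra scale factors.

```python
import numpy as np

n_fft = 512
n = np.arange(n_fft)
window = 0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft)  # periodic Hann (assumed)

# L1 check: a unit-amplitude cosine centered on bin k0 should peak at 1/2.
k0 = 32
x = np.cos(2 * np.pi * k0 * n / n_fft)
spectrum = np.fft.rfft(x * window / window.sum())  # L1-normalized window
peak_bin = int(np.argmax(np.abs(spectrum)))
peak_l1 = float(np.abs(spectrum).max())

# L2 check: with a unit-norm window, the STFT should preserve signal energy.
rng = np.random.default_rng(0)
signal = rng.standard_normal(2048)
g = window / np.linalg.norm(window)  # L2-normalized window

# One frame per sample (hop size 1), wrapping around the end of the signal.
idx = (np.arange(signal.size)[:, None] + n[None, :]) % signal.size
frames = signal[idx] * g
stft = np.fft.fft(frames, axis=1, norm="ortho")  # unitary DFT over all bins

energy_stft = float(np.sum(np.abs(stft) ** 2))
energy_signal = float(np.sum(signal ** 2))
```

With larger hop sizes, zero padding at the edges, or a one-sided spectrum, the energies only match up to a known factor or approximately, so a test would need a tolerance along the lines of np.isclose.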

Nov 26 '21 10:11 Manza12

I think approximation errors should be good enough!

Nov 26 '21 10:11 KinWaiCheuk
