nnAudio

[Feature Request] Allow STFT kernels to be normalized

Open Manza12 opened this issue 2 years ago • 3 comments

I think it would be nice to have normalization options for STFT kernels (they already exist for the CQT, in the forward pass, via the parameter normalization_type) in order to control the norm of the output.

If you want I can do a PR.

Manza12, Nov 25 '21 20:11

Yes please! It would be a great help. We would then also need to add a few more test cases to the pytest file, if possible, to make sure the normalizations are correct. Currently I am using librosa as a reference, but it seems that librosa does not provide a normalization option either. Do you know of any alternative references that we could compare our implementation against?

KinWaiCheuk, Nov 26 '21 02:11

Regarding tests, the only idea I have is to test the STFT of pure sinusoidal functions, which will have a maximum value of 1/2 if the kernels are normalized with the L1 norm. We could also implement a "wrap" style, which folds positive and negative frequencies together so that the maximum value is 1. These normalization questions came up because, when working with floating-point audio signals (which lie between -1 and 1), I want the STFT (or CQT) magnitude to lie between 0 and 1. It may also be interesting to have L2 normalization, which ensures an interesting property illustrated in the following formula (from Foundations of Time-Frequency Analysis by K. Gröchenig): [image: normalization_property]

Here, g is the window function and V_g stands for the STFT. We could also check that property in the tests, but we may get approximation errors...
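The sinusoid test proposed above can be sketched in plain NumPy as follows. This is only an illustration, not nnAudio code: the helper `l1_normalized_hann`, the frame length, and the reading of the "wrap" style as adding each negative-frequency magnitude to its positive-frequency partner are all assumptions made for this sketch.

```python
import numpy as np

def l1_normalized_hann(n_fft):
    """Periodic Hann window scaled so its L1 norm is 1 (hypothetical helper)."""
    n = np.arange(n_fft)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft)
    return window / np.sum(np.abs(window))

n_fft = 1024      # frame length, chosen arbitrarily for the sketch
bin_index = 32    # the sinusoid sits exactly on this DFT bin

# Unit-amplitude real sinusoid, so the samples lie between -1 and 1.
n = np.arange(n_fft)
sinusoid = np.cos(2 * np.pi * bin_index * n / n_fft)

# One STFT frame: windowed DFT with the L1-normalized window.
magnitude = np.abs(np.fft.fft(sinusoid * l1_normalized_hann(n_fft)))
peak = magnitude.max()

# Assumed "wrap" style: add bin N-k onto bin k for 1 <= k < N/2.
wrapped = magnitude[1:n_fft // 2] + magnitude[-1:n_fft // 2:-1]
wrapped_peak = wrapped.max()

assert np.isclose(peak, 0.5)          # energy split between +f and -f
assert np.isclose(wrapped_peak, 1.0)  # folding the two halves restores 1
```

The peak is 1/2 because a real sinusoid splits its amplitude evenly between the positive and negative frequency bins; a sinusoid that does not fall exactly on a bin would give a smaller peak, so a test on arbitrary frequencies would need an inequality rather than an equality.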

Manza12, Nov 26 '21 10:11

I think approximation errors should be good enough!
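A tolerance-based check of the L2 property could look like the sketch below. It assumes the property referred to in the thread is the standard identity ‖V_g f‖₂ = ‖f‖₂ ‖g‖₂, and it uses a hypothetical helper `circular_stft` (hop size 1, circular shifts, unnormalized DFT) rather than nnAudio's STFT, so the 1/length factor and the exactness of the identity are specific to this toy setting.

```python
import numpy as np

def circular_stft(signal, window):
    """Full-overlap circular STFT (hypothetical helper, not nnAudio's STFT).

    Row m holds the DFT of signal[n] * window[(n - m) mod length].
    """
    length = len(signal)
    positions = np.arange(length)
    shifted = window[(positions[None, :] - positions[:, None]) % length]
    return np.fft.fft(signal[None, :] * shifted, axis=1)

rng = np.random.default_rng(0)
length = 64
signal = rng.uniform(-1.0, 1.0, size=length)  # float audio in [-1, 1]

window = np.hanning(length)
window = window / np.linalg.norm(window)      # L2 normalization: ||g||_2 = 1

coefficients = circular_stft(signal, window)

# np.fft.fft is unnormalized, so divide by the frame length (Parseval).
stft_energy = np.sum(np.abs(coefficients) ** 2) / length
signal_energy = np.sum(signal ** 2)

# With a unit-norm window the STFT energy should match the signal energy,
# up to floating-point error.
assert np.isclose(stft_energy, signal_energy, rtol=1e-10)
```

With a hop size larger than 1 or with zero-padded frame edges, the equality generally holds only approximately, which is where a looser tolerance in the pytest cases would come in.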

KinWaiCheuk, Nov 26 '21 10:11