
[Feature Request] Constant-Q Transform, custom FFT and perceptual frequency scales

Open TF3RDL opened this issue 2 years ago • 7 comments

Although FFTs are fine, they get really boring for me, so I prefer the constant-Q transform (actually the variable-Q transform) over the FFT for octave-band analysis. However, my CQT implementation (built from a bunch of Goertzel algorithms) is slow; a real-time CQT needs to use a sliding DFT instead.
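For readers unfamiliar with the approach, here is a minimal sketch of a Goertzel-based constant-Q analysis, assuming an unwindowed buffer of mono samples. The function names, the `q` parameter and the normalization are illustrative assumptions, not code from the sketch mentioned in this thread.

```typescript
// Goertzel algorithm: magnitude of a single frequency component of a real signal.
// Assumption: `freq` and `sampleRate` are in Hz; all names here are hypothetical.
function goertzelMagnitude(samples: Float32Array, freq: number, sampleRate: number): number {
  const omega = (2 * Math.PI * freq) / sampleRate;
  const coeff = 2 * Math.cos(omega);
  let s1 = 0; // state s[n-1]
  let s2 = 0; // state s[n-2]
  for (let n = 0; n < samples.length; n++) {
    const s0 = samples[n] + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  // Squared magnitude of the component at `freq`.
  const power = s1 * s1 + s2 * s2 - coeff * s1 * s2;
  // Divide by N/2 so that a full-scale sine yields roughly 1.
  return Math.sqrt(Math.max(power, 0)) / (samples.length / 2);
}

// Constant-Q analysis: each band uses a window of about q cycles of its own
// center frequency, taken from the most recent samples in the buffer.
function constantQSpectrum(
  samples: Float32Array,
  centerFreqs: number[],
  sampleRate: number,
  q: number
): number[] {
  const result: number[] = [];
  for (let i = 0; i < centerFreqs.length; i++) {
    const freq = centerFreqs[i];
    const length = Math.min(samples.length, Math.ceil((q * sampleRate) / freq));
    const recent = samples.subarray(samples.length - length);
    result.push(goertzelMagnitude(recent, freq, sampleRate));
  }
  return result;
}
```

Every band rescans its whole window on each frame, so the cost grows with the number of bands times the window length. That is consistent with the slowness described above.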

I'm also aware that spectrum analyzers built on the Web Audio API don't need to use AnalyserNode.getByteFrequencyData; you can use any FFT library with getFloatTimeDomainData as the input, just like my sketch does. But beware: you need to window the data with a Hann window or something similar before running the FFT, see #3
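As a rough illustration of the windowing step, here is a minimal sketch that applies a periodic Hann window to a buffer of time-domain samples. The function name is hypothetical, and the FFT itself is left to whichever library is used.

```typescript
// Multiply a time-domain buffer by a periodic Hann window before passing it to an FFT.
// Assumption: `timeDomain` holds one analysis frame of mono samples.
function applyHannWindow(timeDomain: Float32Array): Float32Array {
  const size = timeDomain.length;
  const windowed = new Float32Array(size);
  for (let i = 0; i < size; i++) {
    // Periodic Hann: w[i] = 0.5 * (1 - cos(2 * pi * i / N))
    const weight = 0.5 * (1 - Math.cos((2 * Math.PI * i) / size));
    windowed[i] = timeDomain[i] * weight;
  }
  return windowed;
}
```

In a browser, the input frame would typically be filled with `analyser.getFloatTimeDomainData(buffer)`, where `buffer` is a `Float32Array` of length `analyser.fftSize`.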

I think perceptual frequency scales like Mel and Bark should be added, because they show the bass frequencies less prominently than a logarithmic scale but more prominently than a linear scale
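To make the comparison concrete, here is a small sketch using two commonly cited approximations: the 2595 * log10(1 + f / 700) formula for Mel, and Traunmüller's formula for Bark without its low- and high-end corrections. The helper names and the 20 Hz to 20 kHz range in the usage note are assumptions for illustration.

```typescript
// Commonly used Hz <-> Mel approximation.
const hzToMel = (hz: number): number => 2595 * Math.log10(1 + hz / 700);
const melToHz = (mel: number): number => 700 * (Math.pow(10, mel / 2595) - 1);

// Traunmüller's Hz <-> Bark approximation (end corrections omitted).
const hzToBark = (hz: number): number => (26.81 * hz) / (1960 + hz) - 0.53;
const barkToHz = (bark: number): number => (1960 * (bark + 0.53)) / (26.28 - bark);

// Normalized horizontal position (0..1) of a frequency on a given scale.
// `toScale` is any monotonic mapping from Hz to scale units.
function scalePosition(
  hz: number,
  minHz: number,
  maxHz: number,
  toScale: (hz: number) => number
): number {
  return (toScale(hz) - toScale(minHz)) / (toScale(maxHz) - toScale(minHz));
}
```

For example, `scalePosition(100, 20, 20000, hzToMel)` falls between the positions given by the linear mapping `(hz) => hz` and the logarithmic mapping `Math.log10`, which matches the observation about bass frequencies above.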

TF3RDL avatar Apr 07 '22 06:04 TF3RDL

Thank you for letting me know about these techniques! Looks like I have a lot to catch up on! 😅

Also, thank you for sharing your sketch! It made me realize that using linear values for the amplitude (instead of dB) makes a huge difference in visualization. I'll have this added as an option in the next release. Next, I think weighting filters would also be a good addition.

Can you recommend any good references for equations/algorithms of the CQT/variable-Q transform, perceptual scales and weighting filters?

Cheers!

hvianna avatar May 07 '22 20:05 hvianna

The equation for the Bark scale is from Traunmüller's work, and A-weighting, as well as the other things, is already covered on Wikipedia
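For reference, here is a minimal sketch of the standard A-weighting curve, using the usual analog pole frequencies (20.6, 107.7, 737.9 and 12194 Hz) and the +2.00 dB offset that normalizes the gain to 0 dB at 1 kHz. The function name is hypothetical.

```typescript
// A-weighting gain in dB for a frequency in Hz.
function aWeightingDb(hz: number): number {
  const f2 = hz * hz;
  const c1 = 20.6 * 20.6;
  const c2 = 107.7 * 107.7;
  const c3 = 737.9 * 737.9;
  const c4 = 12194 * 12194;
  const numerator = c4 * f2 * f2;
  const denominator = (f2 + c1) * Math.sqrt((f2 + c2) * (f2 + c3)) * (f2 + c4);
  // The +2.00 dB offset brings the response to roughly 0 dB at 1 kHz.
  return 20 * Math.log10(numerator / denominator) + 2.0;
}
```

The returned gain can be added to each bin's dB value, or converted to a linear factor first when working with linear amplitudes.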

As for the constant-Q transform, I prefer the sliding DFT, which works best for real-time audio visualization, and there's even a paper on it
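To illustrate the idea, here is a minimal sketch of a single sliding DFT bin, updated once per incoming sample. It assumes an integer bin index `k` and a fixed window of `size` samples; a sliding CQT would need a different window length per band, which this sketch does not attempt. The factory name is hypothetical.

```typescript
// Creates an updater for one sliding DFT bin. Each call pushes one new sample
// and returns the magnitude of bin `k` over the most recent `size` samples.
// Assumption: `k` is an integer bin index, so e^{-j*2*pi*k} equals 1.
function createSlidingDftBin(size: number, k: number): (sample: number) => number {
  const omega = (2 * Math.PI * k) / size;
  const cosW = Math.cos(omega);
  const sinW = Math.sin(omega);
  const history = new Float64Array(size); // circular buffer of the last `size` samples
  let pos = 0;
  let re = 0;
  let im = 0;

  return (sample: number): number => {
    // Add the newest sample and remove the one leaving the window.
    const delta = sample - history[pos];
    history[pos] = sample;
    pos = (pos + 1) % size;
    // X(n) = (X(n-1) + x(n) - x(n-N)) * e^{j*omega}
    const sumRe = re + delta;
    const nextRe = sumRe * cosW - im * sinW;
    const nextIm = sumRe * sinW + im * cosW;
    re = nextRe;
    im = nextIm;
    return Math.sqrt(re * re + im * im);
  };
}
```

Each update costs a handful of multiplications regardless of the window length, which is what makes the approach attractive for real-time use. Rounding errors can accumulate over long runs, so practical implementations often add a damping factor slightly below 1.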

TF3RDL avatar May 18 '22 12:05 TF3RDL

Here's a problem I realized you'll run into before implementing the CQT: the Brown-Puckette method requires the real and imaginary parts, which AnalyserNode doesn't provide (getByteFrequencyData/getFloatFrequencyData only output logarithmic magnitude values). It therefore requires custom FFT functionality (which can be implemented using any FFT library, including ones like this that are bundled with FFT functions). Implementing the sliding CQT also requires AudioWorklets, since it doesn't work well with getFloatTimeDomainData as the waveform data to process
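A small sketch of why the dB output is not enough: converting a complex bin to decibels discards the phase, so the real and imaginary parts cannot be recovered from it. Both helper names are hypothetical, and no particular reference level is assumed.

```typescript
// Magnitude of a complex FFT bin, expressed in decibels (phase is discarded).
function binToDecibels(re: number, im: number): number {
  const magnitude = Math.sqrt(re * re + im * im);
  return 20 * Math.log10(magnitude);
}

// Inverse of the dB conversion: recovers only the linear magnitude.
function decibelsToLinear(db: number): number {
  return Math.pow(10, db / 20);
}
```

For example, the bins (3, 4) and (0, 5) both give the same dB value, although they represent different phases.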

TF3RDL avatar Dec 05 '22 01:12 TF3RDL

@TF3RDL Thanks for following up on this!

For the next beta release, I've made some improvements to the linear amplitude mode and I'm finishing up the work on the weighting filters. I'll try to take a look at the perceptual scales next.

hvianna avatar Dec 10 '22 14:12 hvianna

As for the custom FFT, it could allow non-power-of-two sizes, zero-padding, and different FFT streams or even non-audio data as input (since a custom FFT doesn't depend on the Web Audio API), not just window functions, right?

I'm not sure about the performance impact of using a custom FFT over getByteFrequencyData/getFloatFrequencyData, but I do know that non-power-of-two FFTs are noticeably slower
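One way to combine the two points above is to zero-pad a frame up to the next power of two before handing it to the FFT. Here is a minimal sketch, with a hypothetical function name:

```typescript
// Copy a frame into a zero-filled buffer whose length is the next power of two.
function zeroPadToPowerOfTwo(samples: Float32Array): Float32Array {
  let size = 1;
  while (size < samples.length) size *= 2;
  const padded = new Float32Array(size); // typed arrays start out filled with zeros
  padded.set(samples);
  return padded;
}
```

Zero-padding interpolates the spectrum onto a finer bin grid but does not add real frequency resolution, so any windowing should be applied to the original frame before padding.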

TF3RDL avatar Dec 30 '22 17:12 TF3RDL