
Bark Filterbank for torchaudio

Open ahmed-fau opened this issue 3 years ago • 4 comments

🚀 The feature

Is there any plan/interest to enable Bark spectrogram calculation in torchaudio?

Motivation, pitch

More flexibility for torchaudio users, especially for ML-DSP purposes.

Alternatives

No response

Additional context

No response

ahmed-fau avatar Dec 26 '21 08:12 ahmed-fau

Hi @ahmed-fau

Thanks for the request. Extending torchaudio in the DSP domain is generally of interest to us. However, I am new to the Bark scale. Would you recommend any learning material?

After some quick googling and reading https://www.fon.hum.uva.nl/praat/manual/BarkSpectrogram.html, it seems the procedure is as follows.

Waveform -> power spectrogram -> Bark scale conversion
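
The last step of that pipeline can be sketched in plain Python. This is only an illustration, not torchaudio's implementation: it assumes the Traunmüller (1990) approximation of the Bark scale (other approximations exist) and triangular filters, and the names `hz_to_bark`, `bark_to_hz`, `bark_fbanks` and `apply_fbanks` are hypothetical helpers chosen for this example.

```python
def hz_to_bark(freq_hz: float) -> float:
    """Hz -> Bark using the Traunmueller (1990) approximation (an assumption)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def bark_to_hz(bark: float) -> float:
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (bark + 0.53) / (26.28 - bark)


def bark_fbanks(n_freqs, f_min, f_max, n_barks, sample_rate):
    """Build an (n_freqs x n_barks) matrix of triangular Bark filters."""
    nyquist = sample_rate / 2.0
    # Frequencies of the linear spectrogram bins, from 0 Hz to Nyquist.
    all_freqs = [nyquist * i / (n_freqs - 1) for i in range(n_freqs)]
    # n_barks + 2 edge points, equally spaced on the Bark axis.
    b_min, b_max = hz_to_bark(f_min), hz_to_bark(f_max)
    step = (b_max - b_min) / (n_barks + 1)
    edges = [bark_to_hz(b_min + step * k) for k in range(n_barks + 2)]

    fb = [[0.0] * n_barks for _ in range(n_freqs)]
    for i, f in enumerate(all_freqs):
        for j in range(n_barks):
            lo, centre, hi = edges[j], edges[j + 1], edges[j + 2]
            rising = (f - lo) / (centre - lo)     # slope up to the centre
            falling = (hi - f) / (hi - centre)    # slope down from the centre
            fb[i][j] = max(0.0, min(rising, falling))
    return fb


def apply_fbanks(power_frame, fb):
    """Project one power-spectrum frame (length n_freqs) onto the Bark bands."""
    n_barks = len(fb[0])
    return [sum(p * row[j] for p, row in zip(power_frame, fb))
            for j in range(n_barks)]
```

For example, `apply_fbanks(frame, bark_fbanks(201, 0.0, 8000.0, 24, 16000))` maps one 201-bin power-spectrum frame of 16 kHz audio onto 24 Bark bands; applying it to every frame of a power spectrogram gives a Bark spectrogram.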

So adding a Bark filterbank (and, optionally, BarkSpectrogram) will suffice. Is that what you had in mind?

mthrok avatar Dec 26 '21 13:12 mthrok

Hi @mthrok

Exactly, it's the same interface as MelSpectrogram but with a different psychoacoustic scale (Bark instead of Mel), so adding a Bark filterbank is all that we need.

The Bark scale has recently been used in efficient neural speech synthesis models such as LPCNet.

For the sake of completeness, you could also add an argument for the ERB (equivalent rectangular bandwidth) scale, which is used in recent neural speech enhancement systems such as PercepNet.
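
For readers unfamiliar with the ERB scale, here is a small sketch of the conversions involved. It assumes the Glasberg and Moore (1990) formulas, which may not be the exact variant used in PercepNet, and the function names are hypothetical.

```python
import math


def erb_bandwidth(freq_hz: float) -> float:
    """Equivalent rectangular bandwidth (in Hz) of the auditory filter at freq_hz."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def hz_to_erb_rate(freq_hz: float) -> float:
    """Hz -> ERB-rate scale (number of ERBs below freq_hz)."""
    return 21.4 * math.log10(1.0 + 0.00437 * freq_hz)


def erb_rate_to_hz(erb_rate: float) -> float:
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437
```

Filter edges spaced equally in `hz_to_erb_rate` units and mapped back with `erb_rate_to_hz` would play the same role as the Mel or Bark points in a filterbank.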

ahmed-fau avatar Dec 26 '21 16:12 ahmed-fau

Hi @ahmed-fau

Sorry for the late response, but if you are still available, feel free to make a PR.

mthrok avatar Feb 07 '22 14:02 mthrok

I just came across this post. Since we are talking about Bark scaling, are we talking about RASTA-related spectrogram energies?

underdogliu avatar Apr 07 '22 19:04 underdogliu

Hi @mthrok, it's been a while since your comment about making a PR to include the Bark spectrogram in torchaudio, and there has not been any response. I have implemented it; may I open a PR?

jdariasl avatar Oct 31 '22 11:10 jdariasl

Added as a prototype feature in #2823 and #2843, thanks @jdariasl

carolineechen avatar Nov 14 '22 16:11 carolineechen