audio icon indicating copy to clipboard operation
audio copied to clipboard

Confusing naming for argument `n_stft` in MelScale

Open felipeespic opened this issue 2 years ago • 2 comments

🐛 Describe the bug

The name and docstring of the argument n_stft in MelScale is confusing, because:

  • Just seeing the name, it looks equivalent to n_fft (just changes from FFT to STFT), which is a typical name used for the FFT size. Actually it is used with that purpose in torchaudio Spectrogram and librosa.feature.melspectrogram.
  • Moreover, the docstring for n_stft explicitly states "See n_fft in :class:Spectrogram", which looks as a confirmation that n_stft is indeed the FFT size.

IMO, a more appropriate name for this argument would be something like n_freqs. Actually, it is under that name in torchaudio.functional.melscale_fbanks

Versions

torch 2.0.0+cu117 torchaudio 2.0.1+cu117

(I think nothing else is relevant)

felipeespic avatar Oct 14 '23 00:10 felipeespic

Hey, good point, could you please assign me the task, I am willing to fix the issue, thanks

ivnvalex avatar Feb 15 '24 10:02 ivnvalex

Hi, Thanks. Maybe @carolineechen could assign @ivnvalex, please?

felipeespic avatar Apr 22 '24 23:04 felipeespic