torchaudio.compliance.kaldi.fbank

Open qmpzzpmq opened this issue 4 years ago • 6 comments

Please support batched Kaldi fbank computation. The documentation says "waveform (Tensor) – Tensor of audio of size (c, n) where c is in the range [0,2)", so right now only single-utterance computation is supported.

qmpzzpmq avatar Feb 07 '21 03:02 qmpzzpmq

Thanks for the feedback. This is certainly important, and we will try to address it. We are considering changes to torchaudio.compliance.kaldi. We do not have an immediate action plan at the moment, but we will try to come back to this as soon as possible.

mthrok avatar Feb 08 '21 16:02 mthrok

@qmpzzpmq you can use torchaudio.transforms.MelSpectrogram as an alternative.

Oktai15 avatar Feb 17 '21 11:02 Oktai15

@Oktai15 hi, I'm just wondering whether the results of the two are the same. From the descriptions, the results look different.

qmpzzpmq avatar Feb 18 '21 00:02 qmpzzpmq

@qmpzzpmq for example, check this issue: https://github.com/pytorch/audio/issues/157#issuecomment-513872666

Oktai15 avatar Feb 18 '21 11:02 Oktai15

@Oktai15 thanks for the example. I will test whether they give the same result, but it looks like there are still some parameters to check.

qmpzzpmq avatar Feb 19 '21 09:02 qmpzzpmq

I found that FBank cannot run in async mode. Can anyone fix this? Thanks!

haha010508 avatar Sep 25 '23 09:09 haha010508