audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
## 🚀 Feature Add mono-to-stereo and stereo-to-mono conversion ## Motivation You guys have done an amazing job, but stereo to mono and vice versa is simple,...
Implements a transformation for converting multi-channel audio to mono audio. The conversion is done by just taking the average of each channel and then dividing it by the number of...
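The description above is cut off, so the exact operation in that change is not visible here. As a minimal sketch, assuming the intended downmix is the per-sample mean across channels, the idea can be shown in plain Python. The function name `downmix_to_mono` and the list-of-lists layout are hypothetical and are not part of torchaudio.

```python
def downmix_to_mono(channels):
    """Average equal-length channels sample by sample.

    `channels` is a list of channels, each a list of float samples.
    This layout is an assumption made for illustration only.
    """
    if not channels:
        raise ValueError("expected at least one channel")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same length")
    # zip(*channels) yields one tuple per time step holding every channel's sample.
    return [sum(frame) / len(channels) for frame in zip(*channels)]


stereo = [[0.5, 1.0, -1.0], [0.5, 0.0, 1.0]]
mono = downmix_to_mono(stereo)
```

On a PyTorch tensor laid out as `(channels, time)`, the same averaging would likely be written as `waveform.mean(dim=0, keepdim=True)`.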
### 🚀 The feature Hello! I want to use the fine-tuned HuBERT_base model. However, in torchaudio.pipelines, there are only HUBERT_ASR_LARGE and HUBERT_ASR_XLARGE. What should I do to get a HUBERT_ASR_BASE...
### 🚀 The feature A pure PyTorch implementation of "convolution reverb" as described in https://pytorch.org/audio/stable/tutorials/audio_data_augmentation_tutorial.html#simulating-room-reverberation This should be implemented like "pitch shift", both in "functional" and as a module....
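For readers unfamiliar with the term, convolution reverb amounts to convolving a dry signal with a room impulse response (RIR). The following is a rough plain-Python sketch of that idea using direct "full" convolution. It is not the requested torchaudio feature; the name `convolve` and the sample values are made up for illustration.

```python
def convolve(signal, impulse_response):
    """Direct 'full' convolution of two sample lists."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample launches a scaled copy of the impulse response.
            out[i + j] += s * h
    return out


dry = [1.0, 0.0, 0.0, 0.5]   # hypothetical dry signal
rir = [1.0, 0.5, 0.25]       # hypothetical decaying room impulse response
wet = convolve(dry, rir)
```

Direct convolution costs O(N·M) operations, so for realistic impulse responses an FFT-based convolution is the usual choice.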
I wonder why the padding mode is hardcoded in functional.spectrogram? Maybe add a parameter to support reflect padding? https://github.com/pytorch/audio/blob/main/torchaudio/functional/functional.py#L117
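To make the question concrete, here is a small plain-Python sketch of what reflect padding does to a 1-D signal: the samples are mirrored around the first and last sample without repeating the edge value. The helper `reflect_pad` is hypothetical and says nothing about how `functional.spectrogram` pads internally.

```python
def reflect_pad(samples, pad):
    """Mirror `pad` samples at each end, excluding the edge sample itself."""
    if pad < 0 or pad >= len(samples):
        raise ValueError("pad must satisfy 0 <= pad < len(samples)")
    left = samples[pad:0:-1]            # samples[pad], ..., samples[1]
    right = samples[-2:-pad - 2:-1]     # samples[-2], ..., samples[-pad - 1]
    return left + samples + right


padded = reflect_pad([1, 2, 3, 4], 2)
```

In PyTorch itself, `torch.nn.functional.pad` accepts `mode="reflect"` for batched inputs, which is probably what such a parameter would forward to.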
### 🐛 Describe the bug The Kaldi feature extraction algorithm returns two tensors, `pitch` and `NCCF`: ```python import torch import torchaudio from torchaudio.utils import download_asset import torchaudio.functional as F SAMPLE_SPEECH...
I've received feedback that the versioning UI is not obvious enough and that it is unclear what it means. Adding some copy to the page to clarify.
### 🐛 Describe the bug The SpectralCentroid transform supports only real-valued inputs. If complex values are provided, it yields an error: ```RuntimeError: Cannot have onesided output if window...
### 🐛 Describe the bug I am trying to modify example/asr/librispeech_ctc_decode/inference.py to run in batch mode. Here is my script: https://gist.github.com/yuekaizhang/f20904cfaf23e457a744f08ea19ce18e#file-inference_bug-py-L55 However, I found that with different batch_size, the WER...