audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
Adopt `:autosummary:` for various modules:
* torchaudio.compliance.kaldi
* torchaudio.sox_effects
* torchaudio.utils
### 🚀 The feature @carolineechen The current RNNT loss takes logits as inputs. I wonder if it is possible to have a version that takes log probabilities rather than logits....
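For context on the distinction raised in this request, the following is a minimal standard-library sketch of how logits map to log probabilities through a numerically stable log-softmax. It is not torchaudio's RNNT loss and makes no claim about how that loss is implemented; the helper name `log_softmax` and the sample values are illustrative assumptions.

```python
import math


def log_softmax(logits):
    """Return log probabilities for a list of raw scores (logits).

    Hypothetical helper for illustration only; subtracting the maximum
    before exponentiating avoids overflow for large scores.
    """
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_sum_exp for v in logits]


# Illustrative values, not taken from any real model output.
logits = [2.0, 1.0, 0.1]
log_probs = log_softmax(logits)

# Exponentiating the log probabilities yields a distribution that sums to 1.
total = sum(math.exp(v) for v in log_probs)
```

A loss that accepts log probabilities would skip this normalization step internally, whereas a loss that accepts logits applies it itself. In PyTorch, `torch.nn.functional.log_softmax` performs the same normalization on tensors.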
### 🐛 Describe the bug When using the two loading methods on the same audio file, the lengths of the waveform tensors are different. I can reproduce this issue with...
### 🐛 Describe the bug StreamReader fails to open an `io.BytesIO` input that was saved by `torchaudio.save` ```python import io import torchaudio from torchaudio.io import StreamReader wav_file = "demo.wav" streamer...
### 🚀 The feature The [Modified Discrete Cosine Transform (MDCT)](https://en.wikipedia.org/wiki/Modified_discrete_cosine_transform) is a perfectly invertible transform that can be used for feature extraction. It can be used as an alternative to...
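To illustrate the invertibility claim, here is a minimal standard-library sketch of a windowed MDCT and inverse MDCT with 50% overlap-add. It assumes a sine window (which satisfies the Princen-Bradley condition), zero padding of one half-block on each side, and a signal length that is a multiple of the half-block size. All function names are hypothetical, and this is not an existing or proposed torchaudio API.

```python
import math


def sine_window(n_half):
    """Sine window of length 2 * n_half; w[n]^2 + w[n + n_half]^2 == 1."""
    length = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def _basis(n, k, n_half):
    """MDCT cosine basis value for time index n and frequency index k."""
    return math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))


def mdct(block, window):
    """Map a windowed block of 2 * n_half samples to n_half coefficients."""
    n_half = len(block) // 2
    return [
        sum(window[n] * block[n] * _basis(n, k, n_half) for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs, window):
    """Map n_half coefficients back to 2 * n_half windowed, aliased samples."""
    n_half = len(coeffs)
    scale = 2.0 / n_half  # factor 2 accounts for windowing at both stages
    return [
        window[n] * scale * sum(coeffs[k] * _basis(n, k, n_half) for k in range(n_half))
        for n in range(2 * n_half)
    ]


def roundtrip(signal, n_half):
    """Analyze and resynthesize a signal; aliasing cancels in the overlap-add."""
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        block = padded[start:start + 2 * n_half]
        for i, value in enumerate(imdct(mdct(block, window), window)):
            out[start + i] += value
    return out[n_half:n_half + len(signal)]


# Illustrative input: 16 samples, half-block size 4.
signal = [math.sin(0.3 * i) for i in range(16)]
reconstructed = roundtrip(signal, 4)
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
```

Each individual block is not invertible on its own, since the inverse returns a time-aliased version of the input. The reconstruction is exact only after adjacent overlapping blocks are summed, which is why the sketch pads the signal so that every sample is covered by two blocks.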
### 🐛 Describe the bug Directly loading `.mp3` audio with `torchaudio.sox_effects.apply_effects_file` fails: ```python import torchaudio file = "clips/common_voice_id_25649986.mp3" effects = [['speed', '0.9'], ['rate', '48000']] torchaudio.sox_effects.apply_effects_file(file, effects) # output: #...
Increase timeout for the conda installs. I am observing the following error on the Linux conda install: ``` Too long with no output (exceeded 10m0s): context deadline exceeded ``` Ref: https://app.circleci.com/pipelines/github/pytorch/audio/12563/workflows/a99a4f55-4006-406a-9d2a-89a24311aa0c/jobs/899681