audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
`InverseMelScale` uses SGD internally, so it does not work when the global context is `no_grad` or `inference_mode`. Even `requires_grad=False` makes it fail. This gives bad UX for...
Hi! I have had some issues using InverseMelScale. Firstly, I used the transform on a Spectrogram, without taking the log or using AmplitudeToDB on the spectrogram. This resulted in very...
## Context: TorchAudio uses dual-binding (PyBind11 and TorchBind) to make custom operations available in Python. Both bindings eventually call the same implementation contained in `libtorchaudio[_XXX].so`. The ones bound via...
The paths in the dataset's CSV file are POSIX-style, which requires OS-agnostic handling on Windows.
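One way to handle such paths portably is sketched below using only the standard library. It is a minimal illustration, not torchaudio's actual fix: the helper name `resolve_posix_entry`, the CSV column names, and the sample row are all hypothetical. The idea is to parse each CSV entry as a POSIX path and rebuild it from its components on top of the local dataset root, so the separator is never hard-coded.

```python
import csv
import io
from pathlib import Path, PurePosixPath


def resolve_posix_entry(root, posix_entry):
    """Join a POSIX-style relative path (as stored in a CSV) onto a local root.

    PurePosixPath always splits on '/', regardless of the host OS, and
    Path.joinpath rebuilds the path with the native separator.
    """
    parts = PurePosixPath(posix_entry).parts
    return Path(root).joinpath(*parts)


# Hypothetical CSV content; a real dataset would be read from disk.
csv_text = "path,label\nclips/speaker1/utt_001.wav,hello\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Each entry is now a native Path under the (hypothetical) dataset root.
resolved = [resolve_posix_entry("dataset_root", row["path"]) for row in rows]
```

Because the path is rebuilt from components, the same code yields backslash-separated paths on Windows and slash-separated paths elsewhere.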
### 🐛 Describe the bug Clean up conda channel flags and make sure we can switch easily between pytorch-nightly, pytorch-test and pytorch. We have the following logic in torchaudio and torchtext: https://github.com/pytorch/audio/blob/main/packaging/pkg_helpers.bash#L210 There...
Hi, I am trying to export the feature extraction process to ONNX:
```
import torch
import torchaudio

model = torchaudio.transforms.MelSpectrogram()
x = torch.randn(1, 16000)
torch.onnx.export(model, x, 'tmp.onnx', input_names=['input'], output_names=['output'])
```
and...
[VoxPopuli](https://github.com/facebookresearch/voxpopuli) publishes pre-trained models for many different languages under the [CC BY-NC 4.0](https://github.com/facebookresearch/covost/blob/main/LICENSE) license. We can add them to torchaudio. ## non-fine-tuned weights https://github.com/facebookresearch/voxpopuli#wav2vec-20 - [ ] es - base -...
This issue is to track the follow-up work to #1137, which introduced `rnnt_loss` and `RNNTLoss` as a [prototype](https://pytorch.org/audio/stable/index.html) in `torchaudio.prototype.transducer` using [HawkAaron's warp-transducer](https://github.com/HawkAaron/warp-transducer). - Update documentation - [ ] Guard...