audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
### 🚀 The feature @mthrok, I'd like to propose the integration of Intel GPU decoder and encoder support into Torio/Torchaudio's ffmpeg. This would provide native Intel GPU support for users...
### 🐛 Describe the bug ### ISSUE When I run `python preprocess_lrs3.py --data-dir=D:/BaiduNetdiskDownload/LRS3 --detector=retinaface --dataset=lrs3 --root-dir=D:/pycharmProject/audio_vision/audio-main/examples/avsr/predata --subset=test --seg-duration=16 --groups=4 --job-index=0`, the following appears: `D:\anaconda3\envs\davsr\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available....
### 🚀 The feature I am proposing to add a `torch.nn.Module` transform that automatically crops/pads signals (with different options for padding such as constant/mirroring). I have the implementation already local...
### 🐛 Describe the bug The new `torch.compile` feature does not work with `torchaudio.functional.lfilter`. My understanding is that `torch.compile` needs to know the shapes of the tensor, but these shapes...
Hi, I created an issue (#3707) where I detailed a bug with the [librispeech_conformer_rnnt](https://github.com/pytorch/audio/tree/main/examples/asr/librispeech_conformer_rnnt) ASR example. After some digging I found the reason for that error: the variable batch loses...
## 🐛 Bug In a project that I am working on, I need to keep the spectrogram and the signal aligned. I provided a small script below. The issues that...
### 🐛 Describe the bug Both `torchaudio.functional.pitch_shift` and `torchaudio.transforms.PitchShift` occupy an excessive amount of GPU memory, which is not cleared, while working fine on CPU. #### ISSUE 1 The following piece...
### 🐛 Describe the bug I keep getting the following errors. Can someone shed some light? I can't see any hint in the logs, etc. ``` rm -rf build ; CXX=hipcc USE_FFMPEG=0...
### 🐛 Describe the bug Batch inference with WavLM triggers AssertionError in `WavLMSelfAttention` module. ```python import torchaudio wavlm=torchaudio.pipelines.WAVLM_LARGE.get_model().cuda() wavlm.extract_features(torch.randn(2,16000,device='cuda'),lengths=torch.tensor([2000,3000],device='cuda'),num_layers=1) ``` Log: ``` AssertionError Traceback (most recent call last) [](https://localhost:8080/#) in...