audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
### 🚀 The feature
Is support for the yuv420p format being considered for the module that uses nvenc for accelerated encoding?
### Motivation, pitch
I am using this module for accelerated video...
### 🐛 Describe the bug
The following tests are failing: https://app.circleci.com/pipelines/github/pytorch/audio/16491/workflows/e9d2d0de-56ba-42f8-804b-77bf26fa291f/jobs/1213093
unittest_windows_gpu_py3.8:
```
FAILED torchaudio_unittest\io\stream_reader_test.py::FilterGraphWithCudaAccel::test_scale_cuda_format - RuntimeError: Failed to create the filter from "scale_cuda=format=yuv444p" (Invalid argument.)
FAILED torchaudio_unittest\io\stream_reader_test.py::FilterGraphWithCudaAccel::test_sclae_cuda_change_size - RuntimeError: Failed...
```
By leveraging `StreamReader`, we can load video and images. We should add the functions `torchaudio.io.load_image`, `torchaudio.io.load_video`, and `torchaudio.io.load_audio` as thin wrappers around `StreamReader` (and perhaps `save` versions as well).
to include PTS and the difference of seek
### 🐛 Describe the bug
When using `torchaudio.load` on a 1.79 GB FLAC file, it throws `RuntimeError: Trying to create tensor with negative dimension -225262592: [-225262592, 2]`.
```python
import json...
```
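A negative dimension like the one reported above is what a frame count would look like if it were stored in a signed 32-bit integer and exceeded 2^31 - 1. The issue text shown here does not confirm that cause, so the sketch below is only an arithmetic illustration under that assumption. `wrap_int32` and `candidate` are hypothetical names, and `candidate` is not a measured frame count of the file.

```python
def wrap_int32(n: int) -> int:
    """Interpret the low 32 bits of n as a signed two's-complement integer."""
    n &= 0xFFFFFFFF
    return n - 0x1_0000_0000 if n >= 0x8000_0000 else n


# Value taken from the error message in the report.
reported = -225262592

# Assumption: smallest non-negative count that wraps to the reported value.
candidate = reported + 2**32

print(candidate)
print(wrap_int32(candidate) == reported)
```

If the assumption holds, any true count of the form `candidate + k * 2**32` would produce the same negative value, so the real number of frames cannot be recovered from the error message alone.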
### 🚀 The feature
Add support for ffmpeg filters (`-filter`, `-filter_complex`) to StreamWriter and StreamReader, following the ffmpeg filters documentation: https://ffmpeg.org/ffmpeg-filters.html It would be good to add an argument to set...
### 🐛 Describe the bug
A snippet to reproduce the error is provided below. Adding `backend="sox"` or `backend="soundfile"` to `torchaudio.save` avoids the issue.
```python
import os
from tempfile import NamedTemporaryFile

os.environ["TORCHAUDIO_USE_BACKEND_DISPATCHER"]...
```
Here is the list of feature requests for StreamReader/Writer that I have received so far. Feel free to add to it.
1. [x] PTS support in StreamWriter #3135
   When processing video and audio with StreamReader/Writer,...
### 🐛 Describe the bug
`torchaudio.functional.resample()` doesn't work with complex types. Example:
```
x = torch.randn(1024, 2)
x = torch.view_as_complex(x)
y = torchaudio.functional.resample(x, 1.0, 2.0)
```
This gives the error:
```
RuntimeError:...
```
While torchaudio provides a Mel-scaled spectrogram transformation (`torchaudio.transforms.MEL`), there are a few additional spectral feature transformations that are extremely useful for pre-processing and data augmentation. For example, two feature transformations that...