audio issues

Librispeech ctc recipe

4

Added a Librispeech CTC recipe for demonstration purposes.
- This recipe demonstrates using either [torch.nn.CTCLoss](https://pytorch.org/docs/stable/generated/torch.nn.CTCLoss.html) or [k2.ctc_loss](https://k2-fsa.github.io/k2/python_api/api.html#ctc-loss). Both can converge to similar results.
- It supports using either CTC or...

huangruizhe

CLA Signed

Supporting music use cases in TorchAudio

4

Hello all! Currently, TorchAudio doesn’t provide much support for music use cases. We’d like to gauge interest from the community in our improving that support. Some requests we’ve received include...

hwangjeff

RFC

StreamReader seek frame number

1

### 🚀 The feature It would be nice if seek could go to a specific frame rather than the timestamp. ### Motivation, pitch I'm taking strided chunks from an audio...

lminer
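
Until frame-based seeking is available, one possible workaround is to convert a frame index into a timestamp and pass that to a timestamp-based seek such as `StreamReader.seek`. The sketch below is only an illustration: `frame_to_timestamp` is a hypothetical helper that is not part of TorchAudio, and it assumes a constant frame (or sample) rate and a stream that starts at time zero.

```python
def frame_to_timestamp(frame_index: int, frame_rate: float) -> float:
    """Return the time in seconds at which `frame_index` starts.

    Hypothetical helper: assumes a constant frame rate and a zero start time.
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return frame_index / frame_rate


# Start offsets (seconds) of strided chunks: 16 kHz audio, stride of 8000 samples.
chunk_starts = [frame_to_timestamp(i * 8000, 16000) for i in range(4)]
```

Each value in `chunk_starts` could then be used as the timestamp argument of a seek call. Streams with a variable frame rate or a non-zero start offset would need a different mapping.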

StreamReader with h264_cuvid decoder has default output as bgr24 instead of rgb24

2

### 🐛 Describe the bug Not a serious bug. Contrary to what the documentation says, I had to specify `format="rgb24"` to force the RGB colorspace. ### Versions ``` Collecting...

elmuz

LoadHIP.cmake file cannot be used in rocm5.6

### 🐛 Describe the bug When building Torchaudio with rocm5.6.0, cmake couldn't find the path to rocrand ``` CMake Error at cmake/LoadHIP.cmake:138 (find_package): By not providing "Findrocrand.cmake" in CMAKE_MODULE_PATH this...

Fourish

module: rocm

Hubert Pre-Training Example : Unable to load the saved checkpoint and resume training

5

### 🐛 Describe the bug Issue with the Hubert pre-training scripts at https://github.com/pytorch/audio/tree/main/examples/hubert. I am unable to resume training by loading the latest "End of the epoch" checkpoints. I...

varun-krishnaps

Floating point exception (core dumped) on torchaudio.load

3

### 🐛 Describe the bug I also encountered the same problem as [2870](https://github.com/pytorch/audio/issues/2870), which caused a Floating point exception (core dumped) when loading ADPCM encoded audio and caused the service...

zengruizhao

Usage of `TimeStretch` is incorrect in documentation.

### 🐛 Describe the bug In an example of https://pytorch.org/audio/stable/transforms.html, `TimeStretch` takes arguments as follows: ```python TimeStretch(stretch_factor, fixed_rate=True) ``` This usage is incorrect. According to https://pytorch.org/audio/stable/generated/torchaudio.transforms.TimeStretch.html, this class takes the...

tky823

GPU Video Encoder Regression - RGB24

Ref https://github.com/pytorch/audio/issues/3317#issuecomment-1540433493 The GPU encoder used to accept RGB24, but after the recent refactoring in the main branch it expects RGBA32. An extra padding option to convert RGB24 to RGBA32 should be added...

mthrok
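
To illustrate what padding RGB24 to RGBA32 means at the byte level, here is a minimal pure-Python sketch. It is not TorchAudio's implementation: `rgb24_to_rgba32` is a hypothetical helper, and both the packed R, G, B byte order and the choice to append the fourth byte after each pixel are assumptions, since the excerpt does not state the layout the encoder expects.

```python
def rgb24_to_rgba32(pixels: bytes, alpha: int = 255) -> bytes:
    """Pad packed 3-byte RGB pixels to 4 bytes each by appending `alpha`.

    Hypothetical helper: assumes R, G, B byte order with the extra byte last.
    """
    if len(pixels) % 3 != 0:
        raise ValueError("input length must be a multiple of 3")
    if not 0 <= alpha <= 255:
        raise ValueError("alpha must fit in one byte")
    out = bytearray()
    for i in range(0, len(pixels), 3):
        out += pixels[i:i + 3]  # copy R, G, B unchanged
        out.append(alpha)       # assumed padding byte
    return bytes(out)


# Two pixels in, two four-byte pixels out.
two_pixels = bytes([10, 20, 30, 40, 50, 60])
padded = rgb24_to_rgba32(two_pixels)
```

In practice the same padding would be done on tensors rather than byte strings; the loop form is used here only to make the per-pixel layout explicit.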

[PoC] use pytest configuration file

1

This is #3082 with an added `pytest` configuration as suggested in https://github.com/pmeier/pytest-results-action/issues/9#issuecomment-1573258305.

pmeier

CLA Signed
