audio issues

FLAC save/load is broken with in-memory buffers and `sox_io` backend

4

### 🐛 Describe the bug
```python
from io import BytesIO
import torch
import torchaudio
torch.manual_seed(0)
torchaudio.set_audio_backend("sox_io")
sr = 16000
N = sr  # in case you can't reproduce, try increasing...
```

pzelasko

[WIP] Reducing dependency on sox: silence

This PR keeps track of the progress of implementing the sox silence function (#260). Feel free to suggest any changes to the implementation :)

haideraltahan

cla signed

Encountered undefined symbol: gsm_create when import torchaudio

16

### 🐛 Describe the bug I built torchaudio from source following https://github.com/pytorch/audio/blob/main/CONTRIBUTING.md. The build was successful, but when I import torchaudio in python, I got the following error: ``` >>>...

BriansIDP

needs triage

Add simulate_rir_ism method for simulating RIR with Image Source Method

2

nateanl

cla signed

Added workflow for building torchaudio wheels.

DanilBaibak

cla signed

Implement L-BFGS-B optimizer and update InverseMelScale

10

### 🚀 The feature To increase the speed of the `InverseMelScale` module, the SGD optimization can be replaced with `torch.linalg.lstsq`. ### Motivation, pitch The current `InverseMelScale` module applies SGD optimizer...

nateanl

help wanted

triaged

Backwards breaking change for MP3 loading in 0.12

### 🐛 Describe the bug It is written pretty clearly in the release notes that there is a breaking change when loading MP3 files: ``` MP3 decoding is now handled...

patrickvonplaten

Feedback for wav2letter pipeline

2

Addresses feedback from pytorch#632 Closes vincentqb/audio#2

vincentqb

cla signed

Room Impulse Response Simulation Support in TorchAudio

7

For release 2.0, we plan to add support for multi-channel room impulse response simulation methods under `torchaudio.functional`. The implementation is based on [pyroomacoustics](https://github.com/LCAV/pyroomacoustics), which supports both "image source method", and...

nateanl

new feature

torchaudio.compliance.kaldi.fbank dither is different from kaldi

4

### 🐛 Describe the bug The function `torchaudio.compliance.kaldi.fbank` has an option `dither`. The function calls `_get_window()`, where dither adds random numbers to `strided_input`: ![image](https://user-images.githubusercontent.com/24697257/185326852-4fbdd591-e394-43a5-8c06-6200ee831659.png) Since...

yzwu2017

triaged
