audio issues

torchaudio build fails due to caffe2 related error.

### 🐛 Describe the bug Originated from here. **Please do not hastily close until torchaudio is built successfully. The official instruction does not have enough information to get the build...

jdgh000

Use non-persistent buffers

7

### 🚀 The feature Suggest using `register_buffer()` with `persistent=False`, so the buffer (e.g. window of spectrogram) will not be included in module's state dict. ### Motivation, pitch When I add...
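A minimal sketch of the suggestion above, assuming a hypothetical `ToySpectrogram` module (an illustration only, not torchaudio's actual class): a window registered with `persistent=False` still moves with the module across devices, but it is left out of `state_dict()`.

```python
import torch
from torch import nn


class ToySpectrogram(nn.Module):
    """Hypothetical module for illustration; not torchaudio's implementation."""

    def __init__(self, n_fft: int = 400, persistent: bool = False):
        super().__init__()
        self.n_fft = n_fft
        # The window can be rebuilt from n_fft, so it need not be checkpointed.
        self.register_buffer("window", torch.hann_window(n_fft), persistent=persistent)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # Power spectrogram from a complex STFT.
        spec = torch.stft(waveform, self.n_fft, window=self.window, return_complex=True)
        return spec.abs().pow(2)


# Compare what ends up in the state dict in each mode.
non_persistent_keys = list(ToySpectrogram(persistent=False).state_dict().keys())
persistent_keys = list(ToySpectrogram(persistent=True).state_dict().keys())
```

One design consequence worth noting: checkpoints saved with a persistent buffer contain a `window` key, so loading them into a module that registers the buffer as non-persistent may need `strict=False` or a key filter.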

Thylane

help wanted

good first issue

triaged

FFT frequency bins obtained by `torch.linspace` in `torchaudio.functional.melscale_fbanks`

4

### 🐛 Describe the bug I notice that in https://github.com/pytorch/audio/blob/main/src/torchaudio/functional/functional.py#L561-L568, ```python if norm is not None and norm != "slaney": raise ValueError('norm must be one of None or "slaney"') #...

Emrys365

Resampling from 8 kHz to 16 kHz generates nonexistent spectral components

4

### 🐛 Describe the bug Hi, I have found an abnormal situation when I try to use torchaudio to resample 8 kHz speech data to 16 kHz. The code I am using...

pengyizhou

stream reader that supports padded windows with correct overlap

1

### 🚀 The feature I've written an `AudioBlockReader` that wraps StreamReader to return chunks of audio that are padded left and right with valid data. ### Motivation, pitch Let's say...

tcwalther

Add power_to_db and db_to_power.

2

### 🚀 The feature Convert a power spectrogram to dB and back. ### Motivation, pitch I'm working on replacing my Librosa usage with torchaudio and this is a missing function...
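A rough sketch of what the requested pair could look like, assuming the usual decibel definition `10 * log10(power / ref)`. The function names come from the issue title; the signatures, the `ref` and `amin` parameters, and their defaults are assumptions made for illustration, not an existing torchaudio API.

```python
import torch


def power_to_db(power: torch.Tensor, ref: float = 1.0, amin: float = 1e-10) -> torch.Tensor:
    """Convert a power spectrogram to decibels relative to `ref` (hypothetical signature)."""
    # Clamp to a small floor so that zero-power bins do not produce -inf.
    return 10.0 * torch.log10(torch.clamp(power, min=amin) / ref)


def db_to_power(db: torch.Tensor, ref: float = 1.0) -> torch.Tensor:
    """Inverse of power_to_db, up to the clamping applied at `amin`."""
    return ref * torch.pow(10.0, db / 10.0)


power = torch.tensor([1.0, 10.0, 100.0])
db = power_to_db(power)
roundtrip = db_to_power(db)
```

The round trip is exact only for values above `amin`; anything clamped on the way to dB comes back as `amin` rather than its original value.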

will-rice

TimeMasking does not work with unbatched input

### 🐛 Describe the bug

```python
import torch
from torchaudio.transforms import TimeMasking, FrequencyMasking

x = torch.randn(80, 100)
FrequencyMasking(10)(x)  # this works
TimeMasking(10)(x)  # this doesn't
```

Error message ``` File...

gau-nernst

Implement SpecAugment's Time Warping and SpecAugment wrapper

1

### 🚀 The feature [TimeStretch](https://pytorch.org/audio/stable/generated/torchaudio.transforms.TimeStretch.html) is not in SpecAugment. It should be Time Warping instead. Time Warping does not change the spectrogram shape, but "warps" the content. I want to...

gau-nernst

torchaudio.compliance.kaldi.fbank

6

Please support batched Kaldi fbank computation. The documentation says: "waveform (Tensor) – Tensor of audio of size (c, n) where c is in the range [0,2)". Right now only single-utterance computation is...

qmpzzpmq

Kaldi

wip: Update LibriSpeech evaluation script to support ESPNet models

2

mthrok

CLA Signed
