audio issues

Add a freeze option in Wav2Vec2 and HuBERT bundles

15

### 🚀 The feature In some research cases, the Wav2Vec2 or HuBERT model is expected to be frozen (i.e. ``requires_grad=False`` is set for all parameters). - Users use it as a feature...

nateanl

improvement

module: pipelines

Investigate the sox_effect test stuck in ThreadPoolExecutor.

https://github.com/pytorch/audio/pull/2025

mthrok

Add a DNN beamformer training pipeline to demonstrate usage of torchaudio.transforms.MVDR

### 🚀 The feature Recently torchaudio added support for a mask-based MVDR beamforming module, which takes the multi-channel noisy STFT and the estimated time-frequency masks as input and generates the single-channel enhanced...

nateanl

Update speech recognition tutorial

This is a reminder to update the speech recognition tutorial before the release of v0.11. We have added model surgery to the pre-trained wav2vec2 model so as to remove unused dimensions at...

mthrok

Replace _get_mat_trace when torch.linalg.trace is ready to use.

### 🚀 The feature In ``torchaudio.transforms.MVDR`` the trace of the multi-dimensional tensor is computed via a ``_get_mat_trace`` method due to the lack of PyTorch support. There is an [ongoing PR](https://github.com/pytorch/pytorch/pull/62714)...

nateanl

enhancement

Add xfolding to tacotron2 infer pipeline

2

When vocoding a single example, folding the input into a batch of chunks lets inference run faster. https://github.com/pytorch/audio/blob/31dbb7540c78fe5d176948764cf9a20f55ac80dc/examples/pipeline_wavernn/wavernn_inference_wrapper.py#L167-L177 I excluded it from the initial tacotron2 pipeline, due...

mthrok

nellorebhanuteja

wontfix

[TEST] Smoke test fix

carolineechen

cla signed

ciflow/default

Add HiFi-GAN

Add CTC segmentation

Interactive ASR demo not working
