audio issues

Make TextProcessor object inherit `torch.nn.Module`.

The TextProcessor object from the TTS pipeline should be able to return Tensors on a specified device. Currently, the returned Tensors have to be moved to the target device manually. ```python bundle...
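A minimal sketch of what the request could look like, under stated assumptions: `DeviceAwareTextProcessor`, its symbol table, and the `_anchor` buffer are hypothetical names for illustration and are not the torchaudio implementation. The idea is that a processor subclassing `torch.nn.Module` can carry an empty buffer that follows `.to(device)`, and then create its output Tensors on that buffer's device.

```python
from typing import List, Tuple, Union

import torch


class DeviceAwareTextProcessor(torch.nn.Module):
    """Hypothetical character-level text processor (illustration only)."""

    def __init__(self, symbols: str = "_-!'(),.:;? abcdefghijklmnopqrstuvwxyz") -> None:
        super().__init__()
        # Map each symbol to its index; index 0 ("_") doubles as padding here.
        self._index = {symbol: i for i, symbol in enumerate(symbols)}
        # An empty, non-persistent buffer moves with .to(device),
        # so it records which device the outputs should live on.
        self.register_buffer("_anchor", torch.empty(0), persistent=False)

    def forward(self, texts: Union[str, List[str]]) -> Tuple[torch.Tensor, torch.Tensor]:
        if isinstance(texts, str):
            texts = [texts]
        device = self._anchor.device
        # Drop characters that are not in the symbol table.
        encoded = [[self._index[c] for c in text.lower() if c in self._index] for text in texts]
        lengths = torch.tensor([len(ids) for ids in encoded], dtype=torch.int32, device=device)
        max_len = max((len(ids) for ids in encoded), default=0)
        # Zero-padded batch of token indices, created directly on the target device.
        tokens = torch.zeros((len(encoded), max_len), dtype=torch.int64, device=device)
        for row, ids in enumerate(encoded):
            tokens[row, : len(ids)] = torch.tensor(ids, dtype=torch.int64, device=device)
        return tokens, lengths


processor = DeviceAwareTextProcessor()
tokens, lengths = processor(["hi", "hello"])
print(tokens.shape, lengths.tolist(), tokens.device)
```

With this design, calling `processor.to("cuda")` on a machine with CUDA available would make both returned Tensors land on the GPU without a manual move in client code.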

mthrok

Updates necessary to WaveRNN training script

2

1. Get rid of `bg_iterator`. https://github.com/pytorch/audio/blob/a6f9cf8babfb096381e914e23950371478672b3e/examples/pipeline_wavernn/main.py#L13 This is known to slow down the process, and it was removed from the library.
2. Replace `DataParallel` with `DistributedDataParallel`. https://github.com/pytorch/audio/blob/a6f9cf8babfb096381e914e23950371478672b3e/examples/pipeline_wavernn/main.py#L325 `DataParallel` is officially deprecated...

mthrok

Moving WaveRNN padding into the model

Currently, WaveRNN's forward method expects client code to pad the input spectrogram to a specific size (`(kernel_size - 1) // 2`). This breaks encapsulation. The WaveRNN's forward method...
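The padding arithmetic can be illustrated with a short sketch. It assumes an odd `kernel_size` and an unpadded convolution over the time axis; the helper names `pad_width` and `frames_after_conv` are hypothetical and not part of torchaudio. It also shows why the parentheses matter: in Python, `//` binds tighter than `-`, so `kernel_size - 1 // 2` does not give half the kernel width.

```python
def pad_width(kernel_size: int) -> int:
    """Frames to add on each side of the spectrogram (hypothetical helper)."""
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("this sketch assumes an odd, positive kernel_size")
    return (kernel_size - 1) // 2


def frames_after_conv(n_frames: int, kernel_size: int) -> int:
    """Frame count after padding both sides, then an unpadded convolution."""
    padded = n_frames + 2 * pad_width(kernel_size)
    return padded - kernel_size + 1


kernel_size = 5
print(pad_width(kernel_size))               # 2
print(kernel_size - 1 // 2)                 # 5, because 1 // 2 is evaluated first
print(frames_after_conv(100, kernel_size))  # 100, the frame count is preserved
```

If the model applied this padding internally, client code could pass the spectrogram as-is and would not need to know `kernel_size`.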

mthrok

ERROR: Could not find a version that satisfies the requirement torchaudio (from versions: none) ERROR: No matching distribution found for torchaudio

3

### 🐛 Describe the bug ERROR: Could not find a version that satisfies the requirement torchaudio>=0.5.0 (from asteroid) (from versions: none) ERROR: No matching distribution found for torchaudio>=0.5.0 (from asteroid)...

clort81

question

There is a contraction in WaveRNNInferenceWrapper during inference when batched

1

The waveform above is the ground truth and the one below is the generated waveform with `batched` set to true. See this [colab example](https://colab.research.google.com/drive/19-LC9MhNQFsoSH93b4hPWXXcycmoutmb?usp=sharing) ([internal](https://colab.research.google.com/drive/1d-kmdYALKDwGw-u5lTb8JAP7dyKFtWbk?usp=sharing)) for more information.

yangarbiter

bug

Remove insignificant test assets

14

@astaff introduced a guideline for test assets in https://github.com/pytorch/audio/pull/759, and we can get rid of the following existing assets.
- [x] `100Hz_44100Hz_16bit_05sec.wav`: sine wave, should be replaced by on-the-fly generation....
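As an illustration of on-the-fly generation, here is a minimal standard-library sketch. The parameters are read off the asset's file name as a guess (100 Hz tone, 44100 Hz sample rate, 16-bit samples, 0.5 seconds), and `sine_wave` and `to_int16` are hypothetical helpers; a test suite built on torch would presumably produce a Tensor instead of a list.

```python
import math
from typing import List


def sine_wave(freq_hz: float, sample_rate: int, duration_sec: float,
              amplitude: float = 1.0) -> List[float]:
    """Generate a sine wave as floats in [-amplitude, amplitude]."""
    num_samples = int(round(sample_rate * duration_sec))
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * i) for i in range(num_samples)]


def to_int16(samples: List[float]) -> List[int]:
    """Scale floats in [-1, 1] to 16-bit integers, clamping out-of-range values."""
    return [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]


# Parameters guessed from the file name; adjust if the asset differs.
wave = sine_wave(freq_hz=100.0, sample_rate=44100, duration_sec=0.5)
pcm = to_int16(wave)
print(len(pcm))  # number of samples generated
```

Generating the signal inside the test keeps the repository free of the binary file and makes the signal parameters explicit at the call site.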

mthrok

help wanted

module: tests

kaldi.fbank alternative in librosa?

1

## ❓ Questions and Help Hi everyone, I would really appreciate it if someone could let me know how to replicate the **compliance.kaldi.fbank()** function in **librosa**. I've gone through a lot of literature...

AlexJian1086

cuda_version can't be changed in CI. It needs to be improved.

1

## 🐛 Bug In unittest_windows_gpu, there's an environment variable named CUDA_VERSION. https://github.com/pytorch/audio/blob/ecd068f583e4e45ef80635d975ac2bf38d4e819a/.circleci/config.yml.in#L514-L520 However, its value cannot actually be changed, because it is reverted to 10.2 by regenerate.py https://github.com/pytorch/audio/blob/ecd068f583e4e45ef80635d975ac2bf38d4e819a/.circleci/regenerate.py#L212...

mszhanyi

Loading a BytesIO opus file does not seem to work

4

## 🐛 Bug Loading a BytesIO opus file does not seem to work. ## To Reproduce Steps to reproduce the behavior: ``` import torchaudio import io print(torchaudio.__version__) # samples from...

christopherhesse

bug

C++

module: IO

Add support for bitrate in sox_io backend

3

`soxi` shows the file's bitrate, but I'm pretty sure `torchaudio.info` doesn't when using the `sox_io` backend. ## 🚀 Feature Create functionality within `torchaudio.info` and the `AudioMetadata` object to be able...
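For context, a rough sketch of how a bit rate can be derived from basic metadata. `Metadata` below is a hypothetical stand-in for illustration, not the library's object, and neither function is an existing torchaudio API. For uncompressed PCM the bit rate follows from the sample rate, channel count, and bit depth; for compressed formats an average can only be estimated from file size and duration.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    """Hypothetical stand-in for audio metadata (illustration only)."""
    sample_rate: int      # samples per second, per channel
    num_frames: int       # samples per channel
    num_channels: int
    bits_per_sample: int


def pcm_bit_rate(meta: Metadata) -> int:
    """Bits per second for uncompressed PCM audio."""
    return meta.sample_rate * meta.num_channels * meta.bits_per_sample


def average_bit_rate(file_size_bytes: int, meta: Metadata) -> float:
    """Estimated average bits per second from file size (includes header overhead)."""
    duration_sec = meta.num_frames / meta.sample_rate
    return file_size_bytes * 8 / duration_sec


meta = Metadata(sample_rate=44100, num_frames=22050, num_channels=1, bits_per_sample=16)
print(pcm_bit_rate(meta))             # 705600
print(average_bit_rate(44100, meta))  # 705600.0
```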

rbracco
