audio-data-pytorch How to create a text and audio dataset

Open AI-Guru opened this issue 2 years ago • 1 comment

Hi!

First and foremost: congratulations on this fine collection of repositories! I am slowly working my way through them and I am amazed by how easy and effective your work is.

I will soon start some work on conditional audio generation. What would be a good starting point for creating something like a WAVDataset that yields both audio and text? Would the best approach be to simply extend WAVDataset?

Best, Tristan

Apr 05 '23 18:04 AI-Guru

Hi @AI-Guru, thanks a lot!

A subclass of WAVDataset with extra text metadata would be a good starting option. I personally used a WebDataset (with the custom AudioWebDataset), which basically loads a set of tar files containing numbered wav/json pairs. WebDatasets work well with a lot of data, but they are a bit more involved to start with.

Apr 06 '23 17:04 flavioschneider
