AudioLoader
PyTorch Dataset for Speech and Music audio
AudioLoader is a collection of PyTorch datasets built on top of torchaudio. It provides datasets that are not yet available in torchaudio.
Currently supported datasets:
- Speech
  - Multilingual LibriSpeech (MLS)
  - TIMIT
  - SpeechCommands v2 (12 classes)
- Automatic Music Transcription (AMT)
  - MAPS
  - MusicNet
  - MAESTRO
- Music Source Separation (MSS)
  - FastMUSDB
  - MusdbHQ
Example code
A complete example is available in this repository. The following pseudocode shows the general idea of how to integrate AudioLoader into your existing code.
import pytorch_lightning as pl
from torch.utils.data import DataLoader

from AudioLoader.speech import TIMIT

# AudioLoader helps you set up supported datasets
dataset = TIMIT('./YourFolder',
                split='train',
                groups='all',
                download=True)
train_loader = DataLoader(dataset,
                          batch_size=4)

# Pass the data loader to your own model and trainer
model = MyModel()  # placeholder: your own pl.LightningModule subclass
trainer = pl.Trainer()
trainer.fit(model, train_loader)
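When the clips in a batch differ in length, the default collation in DataLoader cannot stack them into one tensor. This README does not say whether AudioLoader's datasets return fixed-length or variable-length clips, so the following is only a minimal sketch of one common workaround: a custom collate function that zero-pads every waveform to the longest clip in the batch. `ToyAudioDataset`, `pad_collate`, and the `"waveform"` / `"length"` keys are hypothetical names used for illustration; they are not part of AudioLoader, and the real sample format may differ.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToyAudioDataset(Dataset):
    """Hypothetical stand-in for an audio dataset with variable-length clips."""

    def __init__(self, lengths):
        self.lengths = lengths

    def __len__(self):
        return len(self.lengths)

    def __getitem__(self, idx):
        # Assumed sample format: a dict holding a 1-D waveform tensor.
        return {"waveform": torch.ones(self.lengths[idx])}


def pad_collate(batch):
    """Zero-pad each waveform to the longest clip in the batch."""
    waveforms = [sample["waveform"] for sample in batch]
    # Keep the original lengths so a model can ignore the padded region.
    lengths = torch.tensor([w.shape[0] for w in waveforms])
    padded = pad_sequence(waveforms, batch_first=True)  # (batch, max_len)
    return {"waveform": padded, "length": lengths}


loader = DataLoader(ToyAudioDataset([16000, 12000, 8000, 20000]),
                    batch_size=4,
                    collate_fn=pad_collate)
batch = next(iter(loader))
print(batch["waveform"].shape)  # torch.Size([4, 20000])
```

To try the same idea with a real dataset, pass a collate function of this kind as `collate_fn` to the `DataLoader` and adapt the dictionary keys to whatever each sample actually contains.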
Installation
pip install git+https://github.com/KinWaiCheuk/AudioLoader.git
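After installing, a quick way to confirm that the package is visible to your Python environment is to look it up on the import path. This is a small sketch using only the standard library. `is_installed` is a hypothetical helper name, and the module name `AudioLoader` is taken from the import statement in the example code.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


print("AudioLoader installed:", is_installed("AudioLoader"))
```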
News & Changelog
version 0.0.3 (10 Sep 2021):
- Replace broken links with working links for MAPS and TIMIT
- Remove the silence indicators in the phonemic labels for TIMIT