dataloaders

Published 20 hours ago


PyTorch and TFRecord data loaders for several audio datasets

Datasets

ESC - dataset of environmental sounds

LibriSpeech - corpus of read English speech

NSynth - dataset of annotated musical notes

[x] NSynth downloader and generator of *.h5py and *.tfrecord files
[x] TFRecord reader
[x] PyTorch Dataset
[x] PyTorch Dataset for TFRecord
[x] PyTorch DataLoaders for TFRecord

VoxCeleb2 - human speech, extracted from YouTube interview videos

[ ] PyTorch loader
[ ] TFRecord loader

GTZAN - audio tracks from a variety of sources annotated with genre class

[x] GTZAN downloader
[x] PyTorch Dataset

CallCenter - audio tracks with human and non-human speech

[x] PyTorch Dataset

For validation, we frequently use the following scheme:

Read 10 random crops from a file;
Predict a class for each crop;
Average the per-crop predictions to get the result for the file.

For this scheme, we have implemented additional DataLoaders for PyTorch.
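The validation scheme above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the loaders shipped in this repository: `RandomCropsDataset` and `predict_files` are hypothetical names, the in-memory waveforms and the linear model are stand-ins, and two details are assumptions of the sketch: files shorter than one crop are zero-padded, and "averaging" is done over softmax probabilities.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomCropsDataset(Dataset):
    """Hypothetical dataset: each item is `n_crops` random crops of one waveform."""

    def __init__(self, waveforms, labels, crop_len, n_crops=10, seed=0):
        self.waveforms = waveforms  # list of 1-D float tensors, one per file
        self.labels = labels
        self.crop_len = crop_len
        self.n_crops = n_crops
        self.gen = torch.Generator().manual_seed(seed)  # reproducible crops

    def __len__(self):
        return len(self.waveforms)

    def __getitem__(self, idx):
        wave = self.waveforms[idx]
        # Assumption: zero-pad files that are shorter than a single crop.
        if wave.numel() < self.crop_len:
            wave = torch.nn.functional.pad(wave, (0, self.crop_len - wave.numel()))
        max_start = wave.numel() - self.crop_len
        starts = torch.randint(0, max_start + 1, (self.n_crops,), generator=self.gen)
        crops = torch.stack([wave[s:s + self.crop_len] for s in starts.tolist()])
        return crops, self.labels[idx]  # crops: (n_crops, crop_len)


@torch.no_grad()
def predict_files(model, loader):
    """Predict one class per file by averaging per-crop class probabilities."""
    model.eval()
    predictions = []
    for crops, _ in loader:  # crops: (batch, n_crops, crop_len)
        b, n, length = crops.shape
        logits = model(crops.reshape(b * n, length))  # (batch * n_crops, n_classes)
        probs = logits.softmax(dim=-1).reshape(b, n, -1)
        predictions.append(probs.mean(dim=1).argmax(dim=-1))
    return torch.cat(predictions)


# Usage with synthetic data and a stand-in classifier.
waveforms = [torch.randn(16000), torch.randn(30000), torch.randn(4000)]
labels = [0, 1, 2]
dataset = RandomCropsDataset(waveforms, labels, crop_len=8000)
loader = DataLoader(dataset, batch_size=2)
model = torch.nn.Linear(8000, 3)
preds = predict_files(model, loader)  # one predicted class index per file
```

Returning all crops of a file as a single item keeps them together in the batch, so the per-file average is a plain `mean` over the crop dimension.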
