
ENH: add BaseVocalDataset that uses `vocles`

Open NickleDave opened this issue 3 years ago • 0 comments

related to #446

we do already have a base VocalDataset but it's basically just used for prediction

there should be something like a base Dataset class, similar to the hierarchy in torchvision, with an `__init__` that expects a path to a vocles dataset and keeps that path as an attribute

Two sub-classes would be AudioDataset and SpectrogramDataset, each returning from `__getitem__` the audio or spectrogram + any corresponding annotation from the row. We could just always return a dict with audio / spect and annot, and let annot be None for unannotated data. This removes the need to have a separate dataset for prediction.
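A rough sketch of what that hierarchy could look like, just to make the idea concrete. Everything about the on-disk layout here is an assumption, since this issue doesn't say what a vocles dataset looks like: the `metadata.csv` file name and the `audio_path` / `spect_path` / `annot_path` columns are hypothetical, and so are the `loader` / `annot_loader` arguments, which are placeholders that default to returning the resolved path. It also uses plain Python classes instead of sub-classing `torch.utils.data.Dataset` so it runs without torch.

```python
import csv
from pathlib import Path


class BaseVocalDataset:
    """Keeps the path to a dataset directory plus the rows of its metadata table."""

    # hypothetical file name; the real vocles layout is not specified in this issue
    METADATA_FILENAME = "metadata.csv"
    item_key = None     # set by sub-classes: "audio" or "spect"
    path_column = None  # hypothetical column that holds the file path for each row

    def __init__(self, dataset_path, loader=None, annot_loader=None):
        self.dataset_path = Path(dataset_path)
        with open(self.dataset_path / self.METADATA_FILENAME, newline="") as fp:
            self.rows = list(csv.DictReader(fp))
        # placeholder loaders: by default, just hand back the resolved path
        self.loader = loader or (lambda path: path)
        self.annot_loader = annot_loader or (lambda path: path)

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        item = self.loader(self.dataset_path / row[self.path_column])
        # an empty or missing annotation path means the row is unannotated
        annot_path = row.get("annot_path")
        annot = self.annot_loader(self.dataset_path / annot_path) if annot_path else None
        return {self.item_key: item, "annot": annot}


class AudioDataset(BaseVocalDataset):
    item_key = "audio"
    path_column = "audio_path"


class SpectrogramDataset(BaseVocalDataset):
    item_key = "spect"
    path_column = "spect_path"
```

With this shape, prediction on unannotated data is just the same dataset where `annot` comes back as None, so no separate prediction dataset is needed.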

Then e.g. a BFSongRepo dataset would sub-class the SpectrogramDataset? But then we'd need to actually provide spectrograms :thinking:

NickleDave, Jul 08 '22 03:07