ENH: Add RandomCrop transform

Open NickleDave opened this issue 3 years ago • 0 comments

#555 proposes to deprecate WindowDataset.

One good thing about the WindowDataset abstraction is that it gives us data augmentation, and thus translational invariance, "for free", because a network can see literally every possible window in the dataset.

We don't want to lose that.

To achieve something similar, we should have a RandomCrop transform.

As described in #169:

with a "random crop" type transform where we just take a single window from each item in the dataset, where the x for each item is now the whole spectrogram instead of the current abstraction where x is a window
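A minimal sketch of what such a transform could look like, to make the idea concrete. Everything here is an assumption for illustration, not an existing vak API: the class name `RandomCrop`, the `window_size` and `seed` parameters, and the convention that `spect` has shape `(n_freq_bins, n_time_bins)` with one label per time bin. NumPy arrays stand in for tensors.

```python
import numpy as np


class RandomCrop:
    """Hypothetical transform: take one random window from a whole spectrogram."""

    def __init__(self, window_size, seed=None):
        self.window_size = window_size
        # A dedicated generator keeps the crop independent of global random state.
        self.rng = np.random.default_rng(seed)

    def __call__(self, spect, labels):
        # Assumed shapes: spect is (n_freq_bins, n_time_bins), labels is (n_time_bins,).
        n_time_bins = spect.shape[-1]
        if n_time_bins < self.window_size:
            raise ValueError(
                f"spectrogram has {n_time_bins} time bins, "
                f"fewer than window_size={self.window_size}"
            )
        # Valid start indices run from 0 to n_time_bins - window_size, inclusive.
        start = int(self.rng.integers(0, n_time_bins - self.window_size + 1))
        stop = start + self.window_size
        # Crop the spectrogram and its labels with the same bounds.
        return spect[..., start:stop], labels[start:stop]
```

With this sketch, each item in the dataset is the whole spectrogram, and a different window is drawn every time the item is requested, so over many epochs the network still sees many possible windows.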

Importantly, we will need the ability to make RandomCrop window-size aware. We want it to crop in such a way that an individual sample in the dataset can be converted to a set of consecutive non-overlapping windows. Part of the way to achieve this is with cropping. Need to think more about that.
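One possible reading of "window-size aware", sketched under assumptions: crop each sample to a length that is a whole multiple of the window size, choose at random where the leftover time bins are dropped, and return the bounds of the resulting consecutive windows. The function name `crop_to_window_multiple` and its arguments are hypothetical, and this is only one of several ways the requirement could be met.

```python
import random


def crop_to_window_multiple(n_time_bins, window_size, rng):
    """Return (start, stop) bounds of consecutive non-overlapping windows.

    The sample is cropped to the largest multiple of ``window_size`` that fits,
    and the random offset decides how the leftover bins are split between
    the beginning and the end of the sample.
    """
    n_windows = n_time_bins // window_size
    if n_windows == 0:
        raise ValueError("sample is shorter than one window")
    cropped_length = n_windows * window_size
    leftover = n_time_bins - cropped_length
    # Offset of the first window: anywhere from 0 to leftover, inclusive.
    offset = rng.randrange(leftover + 1)
    return [
        (offset + i * window_size, offset + (i + 1) * window_size)
        for i in range(n_windows)
    ]


# Usage: a sample with 105 time bins and a window size of 20 gives 5 windows.
windows = crop_to_window_multiple(105, 20, random.Random(0))
```

Returning bounds rather than cropped arrays keeps the sketch independent of the array library; the bounds can then be used to slice a spectrogram and its labels.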

Other desiderata:

  • [ ] ideally this crop would be "smart" -- for the case of SED one could configure how it crops, e.g. only within silent intervals of a specified duration?
  • [ ] also nice-to-have would be logging what window was used, e.g. through tensorboard
  • [ ] and then being able to re-run an experiment with the exact same windows ... this is "extreme reproducibility" but would be nice to rule out any effect of window choice
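The logging and "exact same windows" items could be sketched roughly as follows. This is an illustration under assumptions, not a proposal for the actual design: the class `LoggedWindowSampler` and its methods are hypothetical, it only records window start indices, and forwarding the log to tensorboard is left out.

```python
import json
import random


class LoggedWindowSampler:
    """Hypothetical helper: pick window starts and remember every choice."""

    def __init__(self, window_size, seed):
        self.window_size = window_size
        self.seed = seed
        # A seeded generator makes the whole sequence of windows repeatable.
        self.rng = random.Random(seed)
        self.log = []

    def sample(self, item_index, n_time_bins):
        if n_time_bins < self.window_size:
            raise ValueError("item is shorter than one window")
        start = self.rng.randrange(n_time_bins - self.window_size + 1)
        # Record which window was used for which item, in order.
        self.log.append({"item": item_index, "start": start})
        return start

    def dumps(self):
        # Serialize the seed and the choices so a run can be audited or replayed.
        return json.dumps(
            {"seed": self.seed, "window_size": self.window_size, "log": self.log}
        )


# Two samplers with the same seed and the same call order choose the same windows.
first = LoggedWindowSampler(window_size=20, seed=42)
second = LoggedWindowSampler(window_size=20, seed=42)
for item_index, n_time_bins in enumerate([100, 250, 80]):
    first.sample(item_index, n_time_bins)
    second.sample(item_index, n_time_bins)
```

Re-running with the same seed only reproduces the windows if items are requested in the same order, so a shuffling data loader would need its own fixed seed as well. Saving the log itself avoids that dependence.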

NickleDave, Jul 29 '22 14:07