ETH Zürich MSc Thesis: Accelerating Neural Audio Synthesis

Results 7 thesis issues

The DDSP and NEWT papers don't mention data augmentation, but the RAVE paper does: "We use dequantization, random crop and allpass filters with random coefficients as our data augmentation strategy."

training
low priority
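
For reference, the three augmentations quoted from the RAVE paper could look roughly like the sketch below. This is an illustration under assumptions, not RAVE's actual code: the function names, the 16-bit step size, the first-order allpass structure and the `max_coeff` bound are all made up for this example, and the input is assumed to be a mono float waveform scaled to [-1, 1).

```python
import numpy as np


def dequantize(x, bits=16, rng=None):
    """Add uniform noise of +/- half a quantization step (assumed 16-bit source)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=np.float64)
    step = 2.0 / (2 ** bits)  # step size for audio scaled to [-1, 1)
    return x + rng.uniform(-0.5, 0.5, size=x.shape) * step


def random_crop(x, length, rng=None):
    """Return a random contiguous excerpt of `length` samples."""
    rng = np.random.default_rng() if rng is None else rng
    start = rng.integers(0, len(x) - length + 1)
    return x[start:start + length]


def random_allpass(x, max_coeff=0.9, rng=None):
    """First-order allpass with a random coefficient `a`, |a| < 1.

    Difference equation: y[n] = -a*x[n] + x[n-1] + a*y[n-1].
    The magnitude response is flat; only the phase changes.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=np.float64)
    a = rng.uniform(-max_coeff, max_coeff)
    y = np.empty_like(x)
    x_prev, y_prev = 0.0, 0.0
    for n in range(len(x)):
        y[n] = -a * x[n] + x_prev + a * y_prev
        x_prev, y_prev = x[n], y[n]
    return y


def augment(x, length, rng=None):
    """Chain the three steps: crop, then allpass, then dequantize."""
    rng = np.random.default_rng() if rng is None else rng
    return dequantize(random_allpass(random_crop(x, length, rng), rng=rng), rng=rng)
```

The same chain could be applied per training example in the data loader of DDSP or NEWT; the sample-by-sample loop in `random_allpass` is slow and would likely need vectorizing for real training.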

Neural audio synthesis models are hard to compare automatically, so a survey will be needed to show that our speedups didn't decrease the quality.

TorchScript's performance did not seem to improve when we forced it to use both vCPUs, and DeepSparse [explicitly chooses not to use both](https://github.com/neuralmagic/deepsparse/issues/459). What happens if we change the number of CPUs?

low priority
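
One way to answer the CPU-count question is a small timing sweep. The sketch below is a hypothetical harness, not something from the existing benchmarks: `sweep_thread_counts`, `run` and `set_threads` are names invented here. It is runtime-agnostic on purpose, so the same loop can time TorchScript or any other backend.

```python
import time
from statistics import median


def sweep_thread_counts(run, set_threads, counts, repeats=5, warmup=1):
    """Return {thread_count: median seconds per call of `run`}.

    `run` is a zero-argument callable performing one inference;
    `set_threads` is a callable that applies a thread count to the runtime.
    """
    results = {}
    for n in counts:
        set_threads(n)
        for _ in range(warmup):  # untimed calls to absorb one-off setup costs
            run()
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run()
            timings.append(time.perf_counter() - start)
        results[n] = median(timings)
    return results
```

With PyTorch, `set_threads` could be `torch.set_num_threads` (which controls intra-op parallelism) and `run` a closure that calls the scripted model on a fixed input. Changing the number of vCPUs of the machine itself would still require rerunning the sweep on a differently sized instance.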

RAVE uses this, but the other models could too.

training
low priority

RAVE (like [Multiband MelGAN](https://arxiv.org/pdf/2005.05106.pdf)) feeds the raw single-band waveform to the discriminator. Wouldn't it make sense to use multiband decomposition for the discriminator as well?
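
To make the idea concrete, here is a minimal sketch of what the discriminator input would look like after a band split. It uses a two-band Haar filter bank purely as a simple stand-in, on the assumption that a real experiment would reuse the PQMF bank the generator already works with; the function names are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_analysis(x):
    """Split an even-length waveform of T samples into bands of shape (2, T/2)."""
    x = np.asarray(x, dtype=np.float64)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / SQRT2   # coarse, low-frequency band
    high = (even - odd) / SQRT2  # detail, high-frequency band
    return np.stack([low, high])


def haar_synthesis(bands):
    """Invert `haar_analysis` and return the full-rate waveform."""
    low, high = bands
    x = np.empty(2 * low.shape[0])
    x[0::2] = (low + high) / SQRT2
    x[1::2] = (low - high) / SQRT2
    return x
```

A multiband discriminator would then take the `(2, T/2)` array as a two-channel input at half the sample rate instead of the `(T,)` waveform, which shortens the sequence it has to process. Whether that helps or hurts quality is exactly the open question here.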