Oytun Turk
Basic attention mechanisms are not very robust when training with long input/output sequences. This becomes especially problematic if one has long training phrases which may contain long pauses making the...
If you check the generation code for MOL, you'll see that it's much more complicated than RAW. It has to sample from a MOL distribution, apply costly functions such as...
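To make the RAW vs. MOL difference concrete, here is a minimal sketch of what one sampling step might look like in each mode. It is an illustration under assumptions, not the code from any particular repository: the function names `sample_raw` and `sample_mol`, the parameter layout, and the clamping range are all hypothetical.

```python
import math
import random


def sample_raw(probs, rng):
    """RAW mode (assumed): draw one class index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_mol(logit_weights, means, log_scales, rng):
    """MOL mode (assumed): draw one value in [-1, 1] from a mixture of logistics."""
    eps = 1e-5
    # 1) Pick a mixture component with the Gumbel-max trick
    #    (one log-of-log per component).
    noisy = [
        w - math.log(-math.log(rng.uniform(eps, 1.0 - eps)))
        for w in logit_weights
    ]
    k = max(range(len(noisy)), key=noisy.__getitem__)
    # 2) Inverse-CDF sample from the chosen logistic component
    #    (one exp and two more logs).
    u = rng.uniform(eps, 1.0 - eps)
    x = means[k] + math.exp(log_scales[k]) * (math.log(u) - math.log(1.0 - u))
    # 3) Clamp to the assumed valid audio range.
    return max(-1.0, min(1.0, x))


rng = random.Random(0)
print(sample_raw([0.1, 0.7, 0.2], rng))
print(sample_mol([0.0, 1.0], [-0.2, 0.3], [-3.0, -3.0], rng))
```

Even in this toy form, the MOL path needs several transcendental function calls per generated sample, while the RAW path is a single categorical draw.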
Tacotron and WaveRNN won't run in real time on a CPU. You may want to look into LPCNet as an alternative to WaveRNN. However, you'll still need to use an inferior model instead...
Which Google paper are you referring to? On Thu, Jul 4, 2019 at 11:30 AM 1105060120 wrote: > @oytunturk But this paper from Google > says it supports real-time on...
That paper only discusses the vocoder portion and, yes, with the sparse WaveRNN model, which does heavy weight pruning, it’s possible to run the sample generation from already-predicted spectrograms....
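For readers unfamiliar with weight pruning, the basic idea can be sketched as below. This is plain unstructured magnitude pruning on a small matrix, chosen only for illustration; it is not claimed to be the exact pruning scheme used in the paper, and the function name `magnitude_prune` is hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out roughly the `sparsity` fraction of smallest-magnitude weights.

    `weights` is a list of rows (a 2-D matrix as nested lists).
    Ties at the threshold are all pruned, so slightly more than the
    requested fraction may be zeroed.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(flat) * sparsity)
    if n_prune == 0:
        return [list(row) for row in weights]
    threshold = flat[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


dense = [[0.1, -0.9], [0.5, -0.05]]
print(magnitude_prune(dense, 0.5))
```

Zeroed weights only save time if the inference code actually skips them, which is why a sparse model needs matching sparse kernels to speed up on a CPU.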
Yes, you’ll need a spectrogram generator + a neural vocoder that are both significantly faster than real-time on a CPU. I’d look into models that can be parallelized to use...
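A simple way to check whether a pipeline is "faster than real-time" is to measure its real-time factor (RTF): processing time divided by the duration of the generated audio. The sketch below assumes a hypothetical `synthesize(text)` callable that returns a sequence of audio samples; the name and signature are placeholders for whatever spectrogram generator plus vocoder you are testing.

```python
import time


def real_time_factor(synthesize, text, sample_rate):
    """Return elapsed_seconds / audio_seconds for one synthesis call.

    RTF < 1.0 means the audio is generated faster than it plays back.
    `synthesize` is a hypothetical callable returning a sequence of samples.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    if len(samples) == 0:
        raise ValueError("synthesize() returned no audio")
    audio_seconds = len(samples) / sample_rate
    return elapsed / audio_seconds


def dummy_synthesize(text):
    # Stand-in for a real model: returns one second of silence at 22050 Hz.
    return [0.0] * 22050


rtf = real_time_factor(dummy_synthesize, "hello", 22050)
print(f"RTF: {rtf:.4f}")
```

For a fair number, time the whole chain (text to spectrogram to waveform) on the target CPU, and average over several utterances of different lengths.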