WaveRNN

Network connection diagram

Open nkcdy opened this issue 5 years ago • 5 comments

The picture posted on the first page is fairly low resolution, and the individual blocks are hard to make out, so I redrew the diagram from the code. Is this a correct understanding of your code?

[Image: WaveRNN_Block]

nkcdy, Aug 13 '19 04:08

@nkcdy that's awesome, thanks! And yeah, that looks right. By the way, there's a higher-res version of my diagram in the assets folder: https://github.com/fatchord/WaveRNN/blob/master/assets/wavernn_alt_model_hrz2.png

fatchord, Aug 14 '19 09:08

Oh, how awkward... but never mind, drawing a block diagram is always a good starting point when studying a new paper or new code.

nkcdy, Aug 14 '19 12:08

By the way, what happens if I train the network on a multi-speaker corpus (for example, 400 speakers)? Will it improve generalization, or will training fail to converge?

nkcdy, Aug 15 '19 01:08

I've managed to train multi-speaker models (without any speaker embedding) in 9bit RAW/mulaw. I haven't tried training a multi-speaker MOL model yet.

fatchord, Aug 15 '19 08:08
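
For readers unfamiliar with the "9bit mulaw" mode mentioned above: a minimal NumPy sketch, assuming the term means mu-law companding followed by quantization to 2^9 = 512 integer classes. This is not taken from the repository; the function names `mulaw_encode` and `mulaw_decode` are hypothetical and only illustrate the general technique.

```python
import numpy as np

def mulaw_encode(x, bits=9):
    """Compand float audio in [-1, 1] and quantize it to 2**bits integer classes."""
    mu = 2 ** bits - 1
    x = np.clip(x, -1.0, 1.0)
    # Mu-law companding: allocates more resolution to small amplitudes.
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Map [-1, 1] to integer labels in [0, mu], rounding to the nearest level.
    return np.floor((y + 1.0) / 2.0 * mu + 0.5).astype(np.int64)

def mulaw_decode(labels, bits=9):
    """Map integer class labels back to float audio in [-1, 1]."""
    mu = 2 ** bits - 1
    y = 2.0 * labels.astype(np.float64) / mu - 1.0
    # Inverse companding: x = sign(y) * ((1 + mu)**|y| - 1) / mu
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu
```

In a setup like this, a vocoder would predict one of the 512 class labels per sample, and the decoded labels would form the output waveform. Whether the repository computes it exactly this way is an assumption.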

> I've managed to train multi-speaker models (without any speaker embedding) in 9bit RAW/mulaw. I haven't tried training a multi-speaker MOL model yet.

Hi, how is the sound quality of the speech generated by your multi-speaker model?

linan06kuaishou, Nov 30 '21 10:11