DiffSinger
Decoder part in e2e training using the opencpop dataset
In the e2e training mode for opencpop, skip_decoder is true, so the decoder part is not trained at all, right? But at inference you still call run_decoder to get mel_out and use it as the starting point for q_sample, right? Why can run_decoder still be used here if the decoder was never trained?
Is that why you use k=60 in cascade mode but k=1000 in e2e mode?
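To make the question concrete, here is a rough sketch of how I understand the start point at inference. It is not the repo's code: it uses the standard DDPM forward formula, and the linear beta schedule, its endpoints (1e-4 to 0.02), the 1000-step length, and the helper name make_alphas_cumprod are all my own assumptions. Only q_sample, mel_out, and the k values come from the discussion above.

```python
import math
import random

def make_alphas_cumprod(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alphas_cumprod = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod

def q_sample(x_start, t, alphas_cumprod, noise):
    """Standard DDPM forward step: sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    a_bar = alphas_cumprod[t]
    signal, sigma = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [signal * x + sigma * n for x, n in zip(x_start, noise)]

alphas_cumprod = make_alphas_cumprod(1000)

rng = random.Random(0)
mel_out = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # stand-in for the decoder output
noise = [rng.gauss(0.0, 1.0) for _ in mel_out]

# Noised start points for the two settings (t is zero-based, so k=60 -> t=59).
x_k60 = q_sample(mel_out, 59, alphas_cumprod, noise)
x_k1000 = q_sample(mel_out, 999, alphas_cumprod, noise)

# How much of mel_out survives in each start point.
signal_k60 = math.sqrt(alphas_cumprod[59])
signal_k1000 = math.sqrt(alphas_cumprod[999])
print(f"signal weight at k=60: {signal_k60:.4f}, at k=1000: {signal_k1000:.4f}")
```

Under these assumed values, the weight on mel_out at k=1000 is close to zero, so the start point would be almost pure noise and an untrained decoder might not matter much. I am not sure whether the actual schedule in the repo behaves the same way.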