About additional loss

Open p0p4k opened this issue 1 year ago • 3 comments

Hello, nice work. I have a question: how about adding an extra loss at the end of generation to match the spk_enc embeddings of the reference wav and the generated wav? I ask because I do not see Meta-StyleSpeech's discriminator being used here (am I missing it somewhere?).

...
s_ref = self.spk_enc(y.transpose(1, 2), (y_mask == 0).squeeze(1))
...
# freeze spk_enc
s_out = self.spk_enc(y_out.transpose(1, 2), (y_out_mask == 0).squeeze(1))
# then use the cosine distance between s_ref and s_out as the extra loss
# unfreeze spk_enc
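
To make the idea above concrete, here is a minimal, self-contained PyTorch sketch of such a loss. It is only an illustration under assumptions, not code from this repo: `ToySpeakerEncoder` is a hypothetical stand-in for `spk_enc`, `speaker_consistency_loss` is a made-up helper name, and the tensor shapes (spectrograms as `(batch, n_mels, frames)`, masks as `(batch, 1, frames)` with 1 for valid frames) are guessed from the calls in the snippet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySpeakerEncoder(nn.Module):
    """Hypothetical stand-in for spk_enc: projects frames, then mean-pools."""

    def __init__(self, n_mels=80, emb_dim=16):
        super().__init__()
        self.proj = nn.Linear(n_mels, emb_dim)

    def forward(self, x, pad_mask):
        # x: (batch, frames, n_mels); pad_mask: (batch, frames), True at padding
        h = self.proj(x)
        keep = (~pad_mask).unsqueeze(-1).to(h.dtype)
        # Average over valid frames only.
        return (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)


def speaker_consistency_loss(spk_enc, y, y_mask, y_out, y_out_mask):
    """Cosine distance between reference and generated speaker embeddings."""
    params = list(spk_enc.parameters())
    prev_flags = [p.requires_grad for p in params]
    # Freeze spk_enc so this loss cannot update the encoder itself.
    for p in params:
        p.requires_grad_(False)
    try:
        # The reference embedding is a fixed target, so no graph is needed.
        with torch.no_grad():
            s_ref = spk_enc(y.transpose(1, 2), (y_mask == 0).squeeze(1))
        # Gradients still flow through y_out back to the generator.
        s_out = spk_enc(y_out.transpose(1, 2), (y_out_mask == 0).squeeze(1))
    finally:
        # Unfreeze: restore the original requires_grad flags.
        for p, flag in zip(params, prev_flags):
            p.requires_grad_(flag)
    return (1.0 - F.cosine_similarity(s_ref, s_out, dim=-1)).mean()
```

In a training step, the returned value would be added to the generator loss with some weight. Freezing the encoder only for this term keeps it from trivially lowering the loss by collapsing its embeddings, while the gradient still reaches the generated spectrogram.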

Thanks.

p0p4k, Nov 07 '23 09:11

This repo is built on pure VITS and StyleSpeech in order to verify the SC-CNN technique. Do you mean the SCL (speaker consistency loss) in YourTTS?

hcy71o, Nov 08 '23 01:11

Is there any StyleSpeech loss in SC-CNN?

p0p4k, Nov 10 '23 09:11

Loss terms related to meta-learning are excluded in this repo.

hcy71o, Dec 29 '23 08:12