
About the implementation in the code

Open wyp19930313 opened this issue 4 years ago • 2 comments

Hello, I have two questions from reading your code. Could you please help answer them when you have time?

  1. Why is the forward propagation of training and prediction different?

     ```python
     def forward(self, x):
         emb = self.speaker_encoder(x)
         mu, log_sigma = self.content_encoder(x)
         eps = log_sigma.new(*log_sigma.size()).normal_(0, 1)
         dec = self.decoder(mu + torch.exp(log_sigma / 2) * eps, emb)
         return mu, log_sigma, emb, dec

     def inference(self, x, x_cond):
         emb = self.speaker_encoder(x_cond)
         mu, _ = self.content_encoder(x)
         dec = self.decoder(mu, emb)
         return dec
     ```

  2. How is the KL-divergence loss calculated?

     ```python
     loss_kl = 0.5 * torch.mean(torch.exp(log_sigma) + mu ** 2 - 1 - log_sigma)
     ```

     https://github.com/jjery2243542/adaptive_voice_conversion/blob/68c33518495d7de404a0f1fdce95e718db86c91b/solver.py#L86
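As for question 1, the difference appears to be the standard reparameterization trick used in VAEs: training samples the latent so that gradients flow through `mu` and `log_sigma`, while inference decodes the deterministic posterior mean. A minimal sketch with made-up tensor shapes (an illustration, not the repo's exact code):

```python
import torch

# Sketch of why forward() and inference() differ: during training a VAE
# samples z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization
# trick), so the sampling step stays differentiable w.r.t. mu and log_sigma.
# At inference no stochasticity is needed, so the mean mu is decoded directly.

torch.manual_seed(0)
mu = torch.randn(4, 8)         # hypothetical posterior mean from the content encoder
log_sigma = torch.randn(4, 8)  # hypothetical log-variance from the content encoder

# training-time latent: stochastic sample via the reparameterization trick
eps = log_sigma.new(*log_sigma.size()).normal_(0, 1)
z_train = mu + torch.exp(log_sigma / 2) * eps

# inference-time latent: deterministic, just the posterior mean
z_infer = mu
```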

wyp19930313 avatar Sep 11 '20 09:09 wyp19930313

Hi @wyp19930313, I have the same two questions. Have you found the answers?

sbkim052 avatar Nov 02 '20 06:11 sbkim052

@sbkim052 The first question follows from the principle of the VAE. I still haven't figured out the second one.
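On the second question, the line in `solver.py` looks like the standard closed-form KL divergence between the approximate posterior N(mu, sigma^2) and the standard-normal prior N(0, 1), which is 0.5 * (sigma^2 + mu^2 - 1 - log sigma^2). If `log_sigma` stores log(sigma^2) (which the sampling expression `torch.exp(log_sigma / 2)` suggests), the formula matches term by term. A small numerical check against `torch.distributions` (a sketch, not the repo's code):

```python
import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)
mu = torch.randn(5, 3)         # hypothetical posterior means
log_sigma = torch.randn(5, 3)  # hypothetical log-variances, log(sigma^2)

# the closed-form KL line from solver.py, averaged over all elements
loss_kl = 0.5 * torch.mean(torch.exp(log_sigma) + mu ** 2 - 1 - log_sigma)

# reference: analytic KL(q || p) from torch.distributions, averaged the same way
q = Normal(mu, torch.exp(log_sigma / 2))  # sigma = exp(log_sigma / 2)
p = Normal(torch.zeros_like(mu), torch.ones_like(mu))
ref = kl_divergence(q, p).mean()
```

The two values agree to floating-point precision, which suggests the hand-written line is just the elementwise Gaussian KL written out explicitly to avoid constructing distribution objects.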

wyp19930313 avatar Nov 24 '20 07:11 wyp19930313