Aaron (Yinghao) Li comments

For song vc what should I do

@panxin801 I'm currently working on singing conversion using this model with some further modifications for better performance. I may submit my work to INTERSPEECH next year.

For song vc what should I do

@mraj96 Sorry, I meant INTERSPEECH next year, so that will be 2023.

Why is ASR model goes to train mode in the training loop

It stays in eval mode the whole time: https://github.com/yl4579/StarGANv2-VC/blob/main/train.py#L85

Why is ASR model goes to train mode in the training loop

I think you are right. That's probably a mistake. What was the error you got?

Why is ASR model goes to train mode in the training loop

I believe there is no difference between train and eval mode for the ASR model, at least for the part we are using here. The part we are using (the...

Why is ASR model goes to train mode in the training loop

I think you are right, though the train/eval mode does not affect group norm. It does affect dropout though, so you can set dropout to 0 without changing the train/eval...
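The point about group norm and dropout can be checked with a small PyTorch sketch. This is not the repository's ASR model; the layers, tensor shapes, and variable names below are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 8, 16)  # (batch, channels, time); arbitrary illustrative shape

# GroupNorm computes its statistics from the current input in both modes,
# so switching between train() and eval() leaves its output unchanged.
norm = nn.GroupNorm(num_groups=2, num_channels=8)
norm.train()
y_train = norm(x)
norm.eval()
y_eval = norm(x)
groupnorm_same = torch.allclose(y_train, y_eval)

# Dropout is only active in train mode; in eval mode it is the identity.
drop = nn.Dropout(p=0.5)
drop.train()
dropout_changes_input = not torch.equal(drop(x), x)
drop.eval()
dropout_is_identity_in_eval = torch.equal(drop(x), x)

# With p=0, dropout is the identity even in train mode.
drop_zero = nn.Dropout(p=0.0)
drop_zero.train()
zero_dropout_is_identity = torch.equal(drop_zero(x), x)

print(groupnorm_same, dropout_changes_input,
      dropout_is_identity_in_eval, zero_dropout_is_identity)
```

If all four flags are true, the only mode-dependent layer in this toy setup is dropout, which is why setting its probability to 0 removes the difference between the two modes.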

Why is ASR model goes to train mode in the training loop

Thanks for letting me know. Can you make a pull request to modify these things for this repo? Or maybe indicate where the problem is, and I can make the...

How to deal with using non training data for inference, and the inference results are not realistic enough to restore

Can you be more specific? Do you mean unseen speakers? Unseen samples? What kind of input is not in the training data?

How to disentangle style and speaker information?

You can define the domains in terms of emotions instead of speakers. This way you preserve the speaker identity and convert only the emotion.
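As a rough sketch of what that relabeling could look like, the snippet below assigns the domain index from the emotion rather than the speaker. The `Utterance` record, the file paths, and the `path|domain` line format are all assumptions made for illustration, not something confirmed in this thread.

```python
from collections import namedtuple

# Hypothetical metadata record; the field names are assumptions.
Utterance = namedtuple("Utterance", ["path", "speaker", "emotion"])

# Made-up example entries: two speakers, three emotions.
utterances = [
    Utterance("spk1/happy_001.wav", "spk1", "happy"),
    Utterance("spk1/sad_001.wav", "spk1", "sad"),
    Utterance("spk2/happy_001.wav", "spk2", "happy"),
    Utterance("spk2/angry_001.wav", "spk2", "angry"),
]


def build_domain_labels(utterances):
    """Map each utterance to a domain index derived from its emotion."""
    emotions = sorted({u.emotion for u in utterances})
    emotion_to_domain = {emotion: i for i, emotion in enumerate(emotions)}
    labeled = [(u.path, emotion_to_domain[u.emotion]) for u in utterances]
    return emotion_to_domain, labeled


emotion_to_domain, labeled = build_domain_labels(utterances)

# Assumed list format: one "path|domain" entry per line.
train_lines = [f"{path}|{domain}" for path, domain in labeled]
print("\n".join(train_lines))
```

Because the speaker no longer determines the domain, utterances from different speakers with the same emotion share one label, so the model is asked to change the emotion while the speaker is left to the source audio.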

How to disentangle style and speaker information?

@CONGLUONG12 It should contain multiple speakers. You can refer to https://arxiv.org/pdf/2302.10536.pdf for more details. This is a good example of how to modify StarGANv2-VC for emotion conversion.