Aaron (Yinghao) Li

Results 110 comments of Aaron (Yinghao) Li

@panxin801 I'm currently working on singing conversion using this model with some further modifications for better performance. I may submit my work to INTERSPEECH next year.

@mraj96 Sorry, I meant INTERSPEECH next year, so it will be 2023.

It is in eval mode the whole time: https://github.com/yl4579/StarGANv2-VC/blob/main/train.py#L85

I think you are right. That's probably a mistake. What was the error you got?

I believe there is no difference between the train and eval mode for the ASR model, at least the part we are using here. The part we are using (the...

I think you are right, although the train/eval mode does not affect group norm. It does affect dropout, so you can set dropout to 0 without changing the train/eval...
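
The group norm and dropout point can be checked with a small PyTorch sketch. The toy network and the helper `set_dropout_to_zero` below are hypothetical stand-ins, not the ASR model from the repo. The sketch only illustrates that `nn.GroupNorm` keeps no running statistics, so it behaves the same in both modes, and that a dropout probability of 0 makes train and eval outputs match.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy network, NOT the ASR model from the repo.
model = nn.Sequential(
    nn.Conv1d(8, 8, kernel_size=3, padding=1),
    nn.GroupNorm(num_groups=4, num_channels=8),
    nn.ReLU(),
    nn.Dropout(p=0.2),
)
x = torch.randn(2, 8, 16)


def set_dropout_to_zero(module: nn.Module) -> None:
    """Set p = 0 on every nn.Dropout layer (hypothetical helper)."""
    for m in module.modules():
        if isinstance(m, nn.Dropout):
            m.p = 0.0


def run(module: nn.Module, training: bool) -> torch.Tensor:
    """Run one forward pass in the requested mode without gradients."""
    module.train(training)
    with torch.no_grad():
        return module(x)


# Group norm normalizes each sample with its own statistics,
# so the mode makes no difference.
gn = nn.GroupNorm(num_groups=4, num_channels=8)
group_norm_same = torch.allclose(run(gn, True), run(gn, False))

# With p = 0.2, the train-mode output is randomly masked and rescaled.
differs_with_dropout = not torch.allclose(run(model, True), run(model, False))

# With p = 0, dropout is the identity in both modes.
set_dropout_to_zero(model)
same_without_dropout = torch.allclose(run(model, True), run(model, False))

print(group_norm_same, differs_with_dropout, same_without_dropout)
```

Layers that do track running statistics, such as batch norm, would still behave differently between the two modes, so this check only covers networks built from group norm and dropout.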

Thanks for letting me know. Can you make a pull request to modify these things for this repo? Or maybe indicate where the problem is, and I can make the...

Can you be more specific? Do you mean unseen speakers? Unseen samples? What kind of input is not in the training data?

You can define the domains in terms of emotions instead of speakers. This way you can preserve the speakers but only convert emotions.
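
As a rough illustration of defining domains by emotion, here is a minimal sketch that relabels a training list. It assumes each entry is written as `path|domain_index`; the emotion set, the file paths, and the helper `relabel_by_emotion` are all hypothetical.

```python
# Hypothetical emotion set; the list order defines the domain index.
EMOTIONS = ["neutral", "happy", "sad", "angry"]


def relabel_by_emotion(entries):
    """Turn (wav_path, emotion_name) pairs into 'path|domain_index' lines.

    The speaker is deliberately ignored: utterances from different
    speakers share a domain whenever they share an emotion.
    """
    index = {name: i for i, name in enumerate(EMOTIONS)}
    return [f"{path}|{index[emotion]}" for path, emotion in entries]


# Hypothetical file paths, for illustration only.
entries = [
    ("./Data/spk1/happy_001.wav", "happy"),
    ("./Data/spk2/sad_014.wav", "sad"),
    ("./Data/spk1/neutral_003.wav", "neutral"),
]

lines = relabel_by_emotion(entries)
num_domains = len(EMOTIONS)  # the model's domain count would need to match
print("\n".join(lines))
```

Because the label carries only the emotion, two speakers with the same emotion fall into the same domain, which is what lets the conversion change the emotion while leaving the speaker alone.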

@CONGLUONG12 It should be of multiple speakers. You can refer to https://arxiv.org/pdf/2302.10536.pdf for more details. This is a good example of how to modify StarGANv2-VC for emotion conversion.