Aaron (Yinghao) Li comments

Results: 110 comments of Aaron (Yinghao) Li

How to disentangle style and speaker information?

@CONGLUONG12 Probably yes, if speaker A has samples with similar emotions in the training set; otherwise it might not work.

My recommendation results are not true

I don't understand your question. What do you mean by "does not match the actual sound"?

Predicted phase not in range [-pi .. pi], but in range [-1 .. 1]

This is insanely weird. I have tried to train it by multiplying the phase by torch.pi, but it fails to converge, while using the range from -1 to 1 works...
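For readers comparing the two conventions in the issue title, here is a minimal sketch of the range conversion only. The helper names `normalized_to_radians` and `wrap_to_pi` are hypothetical and do not come from the repository, and the sketch does not explain the convergence behavior described above.

```python
import math


def normalized_to_radians(phase_unit: float) -> float:
    """Map a phase value in [-1, 1] to radians in [-pi, pi]."""
    return phase_unit * math.pi


def wrap_to_pi(phase_rad: float) -> float:
    """Wrap an arbitrary angle in radians into the interval [-pi, pi)."""
    return (phase_rad + math.pi) % (2 * math.pi) - math.pi


# A normalized phase of 1.0 corresponds to pi radians.
half_turn = normalized_to_radians(1.0)

# An angle of 3*pi/2 wraps to -pi/2.
wrapped = wrap_to_pi(3 * math.pi / 2)
```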

Error Message: RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (1024, 1024) at dimension 2 of input [1, 65621, 2]

Which line did you add? Does this error only happen when you have your extra line?
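One possible reading of the error message, offered as a guess rather than a diagnosis: the input shape [1, 65621, 2] looks like a stereo waveform laid out as (batch, samples, channels), so the padding of 1024 would be applied along the last axis, which has only 2 entries. Under that assumption, downmixing to mono before feature extraction would avoid the mismatch. The sketch below uses NumPy, and `to_mono` is a hypothetical helper, not a function from the repository.

```python
import numpy as np


def to_mono(wave: np.ndarray) -> np.ndarray:
    """Average a (samples, channels) waveform down to shape (samples,).

    A 1-D input is assumed to be mono already and is returned unchanged.
    """
    if wave.ndim == 1:
        return wave
    if wave.ndim == 2:
        # Assumption: channels are on the last axis, as in the error message.
        return wave.mean(axis=1)
    raise ValueError(f"expected a 1-D or 2-D waveform, got shape {wave.shape}")


# A silent stereo clip with the same length as in the error message.
stereo = np.zeros((65621, 2), dtype=np.float32)
mono = to_mono(stereo)
```

After the downmix, the waveform has a single time axis of length 65621, which is longer than the padding of 1024.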

Fix Requirements, Add Colab

Thanks for fixing the requirements. However, your Colab notebook doesn't really work because you didn't actually download the pre-trained models. Instead, you copied them from your Google Drive. You can...

mandrain support?

I did try training on other languages, including Mandarin, Japanese, Hindi, etc., though it requires a few changes: 1. You need to phonemize Chinese into IPA. You can use either...

mandrain support?

For Japanese, you can do the same thing. The conversion table from kana to IPA is the following (again, phonemizer doesn't work for me):

```python
from collections import OrderedDict

kana_mapper = OrderedDict([
    ("ゔぁ", "bˈa"),
    ("ゔぃ", "bˈi"),
    # ...
])
```
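How the table is applied is not shown in the comment above. The following is a minimal sketch that assumes plain string replacement with longer kana sequences tried first, so that multi-character entries are not split. The function `kana_to_ipa` is a hypothetical name, and the table holds only the two entries visible in the comment.

```python
from collections import OrderedDict

# Only the two entries visible in the comment; the full table is truncated there.
kana_mapper = OrderedDict([
    ("ゔぁ", "bˈa"),
    ("ゔぃ", "bˈi"),
])


def kana_to_ipa(text: str, table=kana_mapper) -> str:
    """Replace each kana sequence found in the table with its IPA string.

    Keys are processed longest first (an assumption of this sketch), so a
    two-character kana is matched before any shorter key that it contains.
    """
    for kana in sorted(table, key=len, reverse=True):
        text = text.replace(kana, table[kana])
    return text


print(kana_to_ipa("ゔぁゔぃ"))  # prints "bˈabˈi"
```

Text that has no entry in the table passes through unchanged, which makes missing entries easy to spot in the output.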

mandrain support?

@c9412600 That was a typo that should not have been included; I have fixed it. 155 is the speaker id (never used during training, just for clarification), and X means no...

mandrain support?

@CONGLUONG12 I don't think any change is needed for Vietnamese. You only need to find a conversion table between chữ Quốc ngữ and IPA (maybe phonemizer works for this...

mandrain support?

@yihuitang You need to code it yourself because meldataset.py was written to support English only. I have provided the conversion table, so it should not be difficult for you...