DiffSinger
Inference SVS
Hello! Great job! I would like to know a few things. I'm interested in SVS (PopCS).
- Can you tell me about inference? Which files are used for inference? What's the recipe? How did you manage to reproduce the melody (notes) if MIDI is not used?
- Can I run inference for English? What would I need to do? I understand that the accent will remain Chinese. Are you planning further work on other languages?
1. I think you should read this file to get a better understanding: https://github.com/MoonInTheRiver/DiffSinger/blob/master/docs/README-SVS.md
2. The EN phoneme dictionary is not the same as the ZH one, so the answer is no. You would need to re-train the model using the International Phonetic Alphabet (IPA), or re-train it on EN datasets.
When I run inference with phoneme input, I get an error: "can't convert np.ndarray of type numpy.str_ ........."
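This message is what PyTorch typically raises when a NumPy array of strings is handed to a tensor conversion, so one plausible cause (an assumption, not confirmed in this thread) is that phoneme symbols reach the model without first being mapped to integer IDs. Below is a minimal sketch of such a mapping in plain Python. `build_phoneme_vocab`, `encode_phonemes`, and the sample phonemes are hypothetical names for illustration, not DiffSinger's API.

```python
from typing import Dict, Iterable, List


def build_phoneme_vocab(phonemes: Iterable[str]) -> Dict[str, int]:
    """Assign a stable integer ID to each distinct phoneme (sorted order)."""
    return {ph: idx for idx, ph in enumerate(sorted(set(phonemes)))}


def encode_phonemes(sequence: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map phoneme strings to integer IDs, failing loudly on unknown symbols."""
    unknown = [ph for ph in sequence if ph not in vocab]
    if unknown:
        raise KeyError(f"phonemes missing from the vocabulary: {unknown}")
    return [vocab[ph] for ph in sequence]


# Hypothetical phoneme inventory and input, for illustration only.
vocab = build_phoneme_vocab(["AH", "B", "K", "S"])
ids = encode_phonemes(["K", "AH", "B"], vocab)
print(ids)
```

A list of integers like `ids` can be turned into a tensor, whereas an array of phoneme strings cannot. The explicit `KeyError` also surfaces phonemes that the model's dictionary does not contain, which is the EN versus ZH problem described earlier in the thread.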
We have done exactly that, and we get this error when running SVS inference:
Traceback (most recent call last):
File "inference/svs/ds_e2e.py", line 71, in
[72, 256] is the shape from retraining with our own EN dataset. We don't know where the torch.Size([64, 256]) in the current model is coming from. Any tips?
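A size mismatch such as [72, 256] versus [64, 256] commonly appears when a checkpoint and the model being built disagree on the number of rows in an embedding table, for example because the two were configured with different phoneme dictionaries. That is a guess about this case, not a confirmed diagnosis. The sketch below lists every parameter whose shape differs between two name-to-shape mappings. `find_shape_mismatches` and the parameter names are hypothetical; with PyTorch, such a mapping could be obtained as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint: Dict[str, Shape], model: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared parameters whose shapes differ."""
    return [
        (name, checkpoint[name], model[name])
        for name in sorted(checkpoint.keys() & model.keys())
        if checkpoint[name] != model[name]
    ]


# Hypothetical parameter names; the shapes mirror the ones quoted above.
checkpoint_shapes = {
    "encoder.embed.weight": (72, 256),
    "encoder.proj.weight": (256, 256),
}
model_shapes = {
    "encoder.embed.weight": (64, 256),
    "encoder.proj.weight": (256, 256),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If the only mismatch is in a phoneme embedding, it is worth checking that inference loads the same phoneme dictionary and config that were used for retraining.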
@michaellin99999 Hello, have you managed to get this working yet?