NeuralSVB
How can NSVB generalize to unseen singers?
NSVB is trained on PopBuTFy with 34 speakers. Even with the 30 hours of internal singing data that the paper describes for Stage 1 training, I doubt this amount of data is enough for it to generalize to unseen singers. I believe it can only generalize to singers similar to those in the training set. I trained Stage 1 on the 50-hour OpenSinger data plus 3 other singers; the resulting model generalizes to an unseen singer who resembles one in the training set, but not to a very different singer. Has anyone managed to get better generalization here?