FastSpeech2

About fine-tuning issues.

Open • ltydd opened this issue 1 year ago • 4 comments

I want to fine-tune the pretrained AISHELL3 model on my own dataset, but my dataset has only 6 speakers, while AISHELL3 has 218. I get a size mismatch error when loading the model. Can anyone help me solve this?

ltydd • Jun 04 '23 11:06

I solved this by removing the speaker embedding weight (speaker_emb.weight) from the checkpoint when loading the model: del ckpt['model']['speaker_emb.weight'] in utils/model.py, line 20.

Hide-A-Pumpkin • Jun 06 '23 06:06
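
A minimal sketch of the fix described in the comment above, using a toy module in place of the real FastSpeech2 model. TinyModel and its layer sizes are hypothetical; only the key name speaker_emb.weight and the ckpt['model'] layout come from the thread. One assumption worth flagging: once the key is deleted, load_state_dict has to be called with strict=False, because with the default strict=True the now-missing key is itself reported as an error.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real model: one speaker table, one shared layer."""

    def __init__(self, n_speakers):
        super().__init__()
        self.speaker_emb = nn.Embedding(n_speakers, 4)
        self.proj = nn.Linear(4, 4)


# Pretend this is the 218-speaker pretrained checkpoint.
pretrained = TinyModel(218)
ckpt = {"model": dict(pretrained.state_dict())}

# Drop the speaker table, whose first dimension depends on the speaker count.
del ckpt["model"]["speaker_emb.weight"]

# New model for a 6-speaker dataset.
model = TinyModel(6)

# strict=False tolerates the key that was just removed.
result = model.load_state_dict(ckpt["model"], strict=False)
print(result.missing_keys)  # ['speaker_emb.weight']
```

With this approach the new speaker embedding keeps its random initialisation and has to be learned during fine-tuning, while every other layer starts from the pretrained weights.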

Quoting Hide-A-Pumpkin: "I solved this by removing speaker_emb.weight when loading the model: del ckpt['model']['speaker_emb.weight'] in utils/model.py, line 20."

I have already deleted "speaker_emb.weight", but the error still occurs: “RuntimeError: The size of tensor a (218) must match the size of tensor b (6) at non-singleton dimension 0”.

ltydd • Jun 12 '23 01:06
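
The wording of the error quoted above is that of an elementwise tensor operation, not the "size mismatch for ..." text that load_state_dict emits, so it may originate somewhere other than the model weights. One possibility, offered purely as an assumption that this thread does not confirm, is that optimizer state saved with the 218-speaker model is restored together with the weights: Adam keeps moment buffers with the same shape as each parameter. The sketch below uses a hypothetical TinyModel and a hypothetical "optimizer" checkpoint key to show that such buffers carry the old shape, and starts fine-tuning from a fresh optimizer instead.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in holding only a speaker embedding table."""

    def __init__(self, n_speakers):
        super().__init__()
        self.speaker_emb = nn.Embedding(n_speakers, 4)

    def forward(self, speaker_ids):
        return self.speaker_emb(speaker_ids).sum()


# Simulate a checkpoint saved after one training step with 218 speakers.
old_model = TinyModel(218)
old_optim = torch.optim.Adam(old_model.parameters(), lr=1e-3)
old_model(torch.tensor([0, 1])).backward()
old_optim.step()
ckpt = {"model": old_model.state_dict(), "optimizer": old_optim.state_dict()}

# Adam's first-moment buffer has the shape of the old embedding table.
old_buffer_shape = tuple(ckpt["optimizer"]["state"][0]["exp_avg"].shape)
print(old_buffer_shape)  # (218, 4)

# For the 6-speaker model, build a fresh optimizer and do not restore
# ckpt["optimizer"], so no 218-row buffer meets a 6-row parameter.
new_model = TinyModel(6)
new_optim = torch.optim.Adam(new_model.parameters(), lr=1e-3)
new_model(torch.tensor([0, 1])).backward()
new_optim.step()
```

If this is the cause, skipping the optimizer restore on the first fine-tuning run would avoid the mismatch, at the cost of resetting the optimizer's accumulated statistics.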

I don't remember whether I ran into this error. Have you tried model.load_state_dict(ckpt["model"], strict=False)?

Hide-A-Pumpkin • Jun 15 '23 09:06
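
A caveat on the strict=False suggestion above: in PyTorch that flag only tolerates missing or unexpected keys. A key that is present in both the checkpoint and the model but with a different shape still raises a RuntimeError. The sketch below shows this on a toy module and then loads only the shape-compatible entries. TinyModel and the helper filter_compatible are hypothetical names introduced for illustration, not part of the FastSpeech2 repository.

```python
import torch
import torch.nn as nn


def filter_compatible(ckpt_state, model_state):
    """Split checkpoint entries into those whose name and shape match the model, and the rest."""
    kept, skipped = {}, []
    for name, tensor in ckpt_state.items():
        if name in model_state and tensor.shape == model_state[name].shape:
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped


class TinyModel(nn.Module):
    """Hypothetical stand-in: one speaker table, one shared layer."""

    def __init__(self, n_speakers):
        super().__init__()
        self.speaker_emb = nn.Embedding(n_speakers, 4)
        self.proj = nn.Linear(4, 4)


ckpt_state = TinyModel(218).state_dict()
model = TinyModel(6)

# strict=False alone does not help: the embedding shapes still disagree.
try:
    model.load_state_dict(ckpt_state, strict=False)
    strict_false_failed = False
except RuntimeError:
    strict_false_failed = True

# Load only the entries that fit; the speaker table is left at its fresh initialisation.
kept, skipped = filter_compatible(ckpt_state, model.state_dict())
model.load_state_dict(kept, strict=False)
print(skipped)  # ['speaker_emb.weight']
```

Filtering by shape has the same effect as deleting speaker_emb.weight by hand, but it also catches any other layer whose size depends on the number of speakers.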

Hi. I also encountered this issue. Have you already solved it?

Abel1802 • Dec 08 '23 13:12