FastSpeech2
Tensor size mismatch when training
I encountered this error when trying to train a new model for a different language. The lexicon, TextGrid, and .lab files are from MFA. Can you please take a look at this issue @ming024
Traceback (most recent call last):
  File "train.py", line 198, in <module>
    main(args, configs)
  File "train.py", line 82, in main
    output = model(*(batch[2:]))
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastspeech2/FastSpeech2/model/fastspeech2.py", line 91, in forward
    d_control,
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastspeech2/FastSpeech2/model/modules.py", line 121, in forward
    x = x + pitch_embedding
RuntimeError: The size of tensor a (47) must match the size of tensor b (101) at non-singleton dimension 1
P.S.: I tried running the same train command a few times without changing anything; each time the sizes of tensor a and tensor b were different.
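For what it's worth, the failing add can be reproduced in isolation. A minimal sketch with the shapes from the traceback (the hidden size of 256 is an assumption):

```python
import torch

# The variance adaptor adds a per-position pitch embedding to the encoder
# output, so dimension 1 (the sequence length) must match exactly:
# 47 phoneme positions vs. 101 pitch entries cannot broadcast.
x = torch.zeros(1, 47, 256)                 # (batch, text_len, hidden)
pitch_embedding = torch.zeros(1, 101, 256)  # (batch, pitch_len, hidden)

try:
    x = x + pitch_embedding
    err = None
except RuntimeError as exc:
    err = str(exc)

print(err)  # → the same message as in the traceback above
```

The varying sizes between runs are consistent with shuffled batches: each batch contains different utterances, so the mismatched lengths differ.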
@EuphoriaCelestial Could you please print out the values of these tensors, or give more information?
This looks like the same error as https://github.com/ming024/FastSpeech2/issues/66
oh yes, it was the same error
I just checked my preprocess.yaml; everything is running at phoneme level.
If not, maybe you should check whether the length mismatch is caused by incorrect padding.
I am not sure how to check this
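For reference, a sketch of the padding invariant being suggested (my interpretation, not the repo's actual collate code): within a batch, the text sequence and its per-phoneme pitch row must be padded to the same max length, otherwise the later `x + pitch_embedding` add sees two different sequence lengths.

```python
import numpy as np

def pad_1d(seqs, pad_value=0):
    """Pad a list of 1-D sequences to a shared max length (hypothetical helper)."""
    max_len = max(len(s) for s in seqs)
    return np.stack([np.pad(np.asarray(s, dtype=float),
                            (0, max_len - len(s)),
                            constant_values=pad_value)
                     for s in seqs])

texts = [np.ones(5), np.ones(3)]
pitches = [np.ones(5), np.ones(3)]
print(pad_1d(texts).shape == pad_1d(pitches).shape)  # → True (aligned padding)
```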
yes I can provide any information needed, where can I find those tensors to print its value?
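A hypothetical helper for the question above (`report_mismatch` is not part of the repo): you would paste a couple of prints, or this function, into model/modules.py right before the line the traceback points at (`x = x + pitch_embedding`). NumPy stand-ins are used here so the sketch runs anywhere; in the real code both arguments are torch tensors.

```python
import numpy as np

def report_mismatch(x, pitch_embedding):
    """Print the shapes feeding the failing add; return whether they align."""
    print("x:", tuple(x.shape))
    print("pitch_embedding:", tuple(pitch_embedding.shape))
    return x.shape[1] == pitch_embedding.shape[1]

# Shapes from the reported error:
report_mismatch(np.zeros((1, 47, 256)), np.zeros((1, 101, 256)))  # → False
```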
The best solution is to run the "mfa train xxx" command to generate the TextGrid files again and then run preprocess.py (even running "mfa align xxx" with an align model trained on another dataset with the same lexicon may also not work).
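My reading of why regeneration helps, sketched as the invariant the regenerated files should satisfy (the helper below is illustrative, not repo code): at phoneme level, the duration/pitch/energy arrays written by preprocess.py should each have one entry per phoneme, and the durations should sum to the mel frame count.

```python
import numpy as np

def utterance_ok(phones, duration, pitch, energy, n_mel_frames):
    """Check the per-utterance alignment invariant (hypothetical helper)."""
    n = len(phones)
    return (len(duration) == n and len(pitch) == n and len(energy) == n
            and int(np.sum(duration)) == n_mel_frames)

# Synthetic example: 4 phonemes, durations summing to 100 mel frames.
print(utterance_ok(["HH", "AH", "L", "OW"], [25, 25, 25, 25],
                   np.ones(4), np.ones(4), 100))  # → True
```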
Does that mean generating the lexicon, TextGrid, and lab files all over again?
I just found out I have some files listed in unaligned.txt in the TextGrid folder; should I delete them?
It doesn't matter, just keep them. The lexicon and lab files don't need to change; just use "mfa train xxx" to generate the TextGrids again, and then generate the pitch/mel/energy/duration files again.
okay
If we don't change anything, what is the point of running that command again? The generated TextGrid and pitch/mel/energy/duration files will be the same as before, right? I used pretrained MFA models.
I haven't found the reason yet, but using MFA models trained on another lexicon, or on the same lexicon but another dataset, does indeed cause this error.
Wait, let me recall all the steps I have done:
I used mfa_g2p to generate my own lexicon from my dataset (the pretrained G2P model from the MFA website was used)
then I used that lexicon to generate the TextGrids, using the mfa_train command
and then prepare_align.py to generate the raw_data folder
after that I ran preprocess.py
finally, train.py, and I got this error
Are those the correct steps? If so, which step should I redo now?
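The steps above can be sanity-checked after preprocess.py. This audit script is a sketch under assumptions about the preprocessed layout (the `basename|speaker|{phones}|text` line format and the `duration/SPEAKER-duration-BASENAME.npy` naming are my guesses; adjust them to your config):

```python
import numpy as np
from pathlib import Path

def audit(preprocessed_dir):
    """List utterances whose duration array length differs from the phoneme
    count -- exactly the kind of mismatch the traceback reports."""
    bad = []
    for line in Path(preprocessed_dir, "train.txt").read_text().splitlines():
        basename, speaker, phones, _ = line.split("|", 3)
        n_phones = len(phones.strip("{}").split())
        dur_file = Path(preprocessed_dir, "duration",
                        f"{speaker}-duration-{basename}.npy")
        n_dur = len(np.load(dur_file))
        if n_dur != n_phones:
            bad.append((basename, n_phones, n_dur))
    return bad
```

An empty list means the text and duration files agree; any entries point at the utterances to regenerate.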
Oh, I don't have a g2p step, since I just use the author's Mandarin lexicon.
As for your situation, maybe another reason caused the error. You are training another language; did you add a symbols file in text/yourlanguage.py?
I tried changing the symbols list, but there are some parts I don't fully understand, like arpabet and pinyin.
So I just changed the _letters list.
Are you using your own symbols? Check issue #66, because I had the same problem of shape mismatch. The MFA part is good; the problem is that some processing is incorrectly applied to the text in the dataset.
Yes, I am using my own symbols, but so far I have only changed the _letters list in /text/symbols.py. I don't know how I should change the arpabet list.
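For readers in the same spot, a minimal sketch (assumptions, not the repository's actual file) of what a new-language symbols list might look like, following the same general pattern as text/symbols.py. The "@" prefix keeps phones from colliding with raw letters; `_my_phones` and its contents are placeholders to be replaced with the phone set your MFA lexicon actually uses, instead of the English arpabet list.

```python
# Hypothetical symbols module for a new language (names and contents are
# illustrative; match them to your lexicon's phone inventory).
_pad = "_"
_punctuation = "!'(),.:;? "
_letters = "abcdefghijklmnopqrstuvwxyz"
_silences = ["@sp", "@spn", "@sil"]
_my_phones = ["@" + p for p in ["a", "b", "ch", "ny"]]  # placeholder phones

symbols = [_pad] + list(_punctuation) + list(_letters) + _my_phones + _silences
```

Every phone emitted by MFA into the TextGrids must appear in this list, or the text-to-id mapping will silently drop or misalign symbols.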
I have commented on your issue; please answer those questions.