nguyenlm
@johnfelipe I think you need to install `libsndfile1`, the back end for the `soundfile` library: `sudo apt-get install libsndfile1`. For more details, see [Can't import soundfile python](https://stackoverflow.com/questions/55086834/cant-import-soundfile-python)
Hi @mayfool, can you show the mel-spectrogram image that has a single frequency line?
Yes, you can! https://github.com/ming024/FastSpeech2/blob/d4e79eb52e8b01d24703b2dfc0385544092958f3/model/fastspeech2.py#L43-L58 Set `d_targets` to your custom durations (the default value is `None`, in which case the model predicts the durations itself).
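To make the idea concrete, here is a minimal, self-contained sketch of what supplying durations means. It is not the repo's code: `choose_durations` and `length_regulate` are hypothetical helpers, and the phonemes and numbers are made up. It only illustrates the fallback ("use the given durations, otherwise the predicted ones") and how each phoneme is then repeated for that many frames.

```python
from typing import List, Optional, Sequence


def choose_durations(
    predicted: Sequence[int],
    d_targets: Optional[Sequence[int]] = None,
) -> List[int]:
    """Use caller-supplied durations when given, else fall back to predictions."""
    if d_targets is None:
        return list(predicted)
    if len(d_targets) != len(predicted):
        raise ValueError("d_targets must contain one duration per phoneme")
    return list(d_targets)


def length_regulate(phonemes: Sequence[str], durations: Sequence[int]) -> List[str]:
    """Repeat each phoneme durations[i] times, giving one entry per output frame."""
    frames: List[str] = []
    for phoneme, duration in zip(phonemes, durations):
        frames.extend([phoneme] * duration)
    return frames


# Made-up example values (assumptions, not taken from the repo).
phonemes = ["HH", "AH", "L", "OW"]
predicted = [3, 5, 4, 6]   # what a duration predictor might output
custom = [2, 8, 2, 10]     # durations you want to force

default_frames = length_regulate(phonemes, choose_durations(predicted))
custom_frames = length_regulate(phonemes, choose_durations(predicted, custom))
print(len(default_frames), len(custom_frames))
```

The custom durations need one integer per input phoneme; otherwise the expansion no longer lines up with the text.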
@hypnaceae The training data is a good example. During training, you pass the ground-truth values of `duration`, `pitch` and `energy` as `d_targets`, `p_targets` and `e_targets`. So...
Check this issue: https://github.com/ming024/FastSpeech2/issues/64
Hi @Moonmore, please check the training audio: it should be 22050 Hz, 1 channel (mono), and 16-bit depth.
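If it helps, a quick way to check this is Python's standard `wave` module. This is only a sketch: `check_wav_format` is a hypothetical helper name, it only handles uncompressed PCM WAV files, and the demo files are generated on the fly rather than taken from any dataset.

```python
import os
import tempfile
import wave


def check_wav_format(path, expected_rate=22050, expected_channels=1,
                     expected_sample_width=2):
    """Return a list of mismatch messages; an empty list means the file matches."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()  # bytes per sample; 2 bytes = 16 bits
    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"{channels} channel(s), expected {expected_channels}")
    if width != expected_sample_width:
        problems.append(f"{width * 8}-bit samples, expected {expected_sample_width * 8}-bit")
    return problems


def write_silence(path, rate, channels, sample_width, n_frames=100):
    """Write a short silent WAV file, used here only to demonstrate the check."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (n_frames * channels * sample_width))


with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "good.wav")
    bad_path = os.path.join(tmp_dir, "bad.wav")
    write_silence(good_path, rate=22050, channels=1, sample_width=2)
    write_silence(bad_path, rate=44100, channels=2, sample_width=2)
    good_problems = check_wav_format(good_path)
    bad_problems = check_wav_format(bad_path)

print(good_problems)
print(bad_problems)
```

Running `check_wav_format` over every file in the training folder should show quickly whether any clip needs resampling or downmixing.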
Hi @phamlehuy53, in the TextGrid files generated by the MFA tool, all punctuation is modeled as `sp`.
Because this repo depends on the MFA tool to get the duration of each phoneme, the way punctuation is modeled is determined by MFA, so I think you...
@EuphoriaCelestial MFA doesn't use punctuation to predict silence; it predicts silence in an unsupervised manner. You can see that your TextGrid files only contain the...
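As a rough illustration of how silence ends up as just another interval, here is a small sketch that turns TextGrid-style intervals into per-phoneme frame counts. Everything in it is an assumption for the example: the `(label, start, end)` tuples are made up rather than parsed from a real TextGrid, `intervals_to_durations` is a hypothetical helper, and the hop length of 256 samples is a guess (only the 22050 Hz sampling rate comes from this thread).

```python
from typing import List, Tuple

SAMPLING_RATE = 22050  # Hz
HOP_LENGTH = 256       # samples per mel frame (assumed value)

Interval = Tuple[str, float, float]  # (label, start_seconds, end_seconds)


def intervals_to_durations(intervals: List[Interval]) -> List[Tuple[str, int]]:
    """Convert time intervals into (label, number_of_frames) pairs."""
    frames_per_second = SAMPLING_RATE / HOP_LENGTH
    durations = []
    for label, start, end in intervals:
        # Round both boundaries to frame indices so adjacent intervals
        # share the same boundary frame and no frame is counted twice.
        n_frames = round(end * frames_per_second) - round(start * frames_per_second)
        durations.append((label, n_frames))
    return durations


# Made-up alignment: a pause where a comma might be shows up as an `sp` interval.
intervals = [
    ("HH", 0.00, 0.05),
    ("AH", 0.05, 0.15),
    ("sp", 0.15, 0.40),
    ("L", 0.40, 0.48),
]

durations = intervals_to_durations(intervals)
silence_frames = sum(n for label, n in durations if label == "sp")
print(durations)
print(silence_frames)
```

The point is only that the pause carries a duration like any phoneme; which punctuation mark produced it is not recorded in the interval.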
Did you solve the problem? If not, please share the full trace logs so that I or someone else can help you.