Gustav Eje Henter
Taras is on holiday this week and might not be able to give you an answer for a little while. In the meantime, I can possibly provide some pointers, but...
Issue #8 makes it sound like you have already started training your systems, but I will try to answer the question in this issue nonetheless: For TTS, you...
This is a known issue. The so-called forward algorithm used in neural HMM training is less memory-efficient than Tacotron 2 training, especially for long utterances with many states (phones...
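To illustrate where the memory goes, here is a minimal, self-contained sketch of the forward algorithm for a generic HMM (not this repo's actual implementation). The full T-by-N table of forward variables has to be kept around for backpropagation, so memory grows with both utterance length T and the number of states N:

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_alpha(log_pi, log_trans, log_obs):
    """Compute the full T x N table of log forward variables.

    log_pi[j]: log P(initial state j)
    log_trans[i][j]: log P(state j at t | state i at t-1)
    log_obs[t][j]: log P(frame t | state j)

    Retaining this whole table for the backward pass is what makes
    memory use scale as O(T * N) per utterance during training.
    """
    T, N = len(log_obs), len(log_obs[0])
    alpha = [[0.0] * N for _ in range(T)]
    for j in range(N):
        alpha[0][j] = log_pi[j] + log_obs[0][j]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = logsumexp(
                [alpha[t - 1][i] + log_trans[i][j] for i in range(N)]
            ) + log_obs[t][j]
    return alpha
```

The total data log-likelihood is the log-sum-exp over the last row of the table; in the real model, gradients flow back through every entry.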
Just to check, what codebase was used to create the checkpoint that you want to continue training from? This repo (Neural HMM TTS), [Nvidia's Tacotron 2 implementation](https://github.com/NVIDIA/tacotron2), or some other...
OK. And are you trying to continue training on LJ Speech or on another dataset? To use another dataset, I think the phone set (or at least the number of...
The pre-trained model was trained on LJ Speech, which is a US English dataset. In general, pre-training on one language and then fine-tuning on another is not a standard TTS...
Did you try setting `gpus = [0, 1]`?
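For reference, this would be a change in the hyperparameter file; a sketch (the file path and surrounding settings depend on your version of the repo):

```python
# In the hparams configuration (path is an assumption), list both device
# indices so that training runs on two GPUs:
gpus = [0, 1]
```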
Are you running on Windows? A quick Google search shows that PyTorch on Windows does not support the NCCL backend for distributed communications. As written in [the `torch.distributed` documentation](https://pytorch.org/docs/stable/distributed.html): >...
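A sketch of choosing a distributed backend that works on Windows, where PyTorch does not ship NCCL and `gloo` is the usual fallback. The helper below only computes the backend name, so it runs without `torch` installed; the commented line shows where it would be used:

```python
import platform

def pick_backend(system=None):
    """Return 'nccl' on Linux (fast GPU collectives) and 'gloo' elsewhere."""
    system = system or platform.system()
    return "nccl" if system == "Linux" else "gloo"

# Hypothetical usage when initializing distributed training:
# torch.distributed.init_process_group(backend=pick_backend(), ...)
```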
> I trained at Neural-HMM from the beginning now, but even after 400k iterations there was no sign of correct speaking. My impression is that you are training on graphemes....
What's the best way to move forward, then? Whether you use neural HMM TTS or Tacotron 2, I think you are likely to get better results if you were to...
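To make the grapheme-versus-phone distinction concrete, here is a toy sketch of phone-based input using a hypothetical two-word lexicon (ARPAbet-style symbols). It is not a real grapheme-to-phoneme system; in practice you would use a proper G2P tool or pronunciation dictionary to convert your training transcripts:

```python
# Hypothetical mini-lexicon mapping words to phone sequences.
LEXICON = {
    "the": ["DH", "AH0"],
    "cat": ["K", "AE1", "T"],
}

def to_phones(text, lexicon=LEXICON):
    """Replace each word with its phone sequence; fails on unknown words."""
    phones = []
    for word in text.lower().split():
        phones.extend(lexicon[word])
    return phones
```

The point is that the model then sees pronunciation units directly, instead of having to learn spelling-to-sound rules from the data.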