tts-arabic-pytorch
Make the training reproducible
Great work so far!
We are trying to reproduce the training with more dialects:
- As a first step, we plan to change the voice of the wave files in the original corpus. (I'm running Coqui to convert them to a different voice.)
- Later on, we plan to use our own set of wave files with the new dialect
The issues I found so far in the first step:
- There is no documentation about running the `scripts/extract_f0.py` part to create the `data/pitch_dict.pt` file; it would be good to add this to the docs. I learned it by reading some code.
- After this part, when I first run the `train.py` file with my custom configuration, I get an error that the `ArabDataset` class is not being called correctly; it is passing a `cache` parameter that doesn't exist in the actual implementation.
- I removed the parameter to try to brute force the results, but it led to another error:
Traceback (most recent call last):
  File "tts-arabic-pytorch/train.py", line 205, in <module>
    main()
  File "tts-arabic-pytorch/train.py", line 157, in main
    train_loader = DataLoader(train_dataset,
  File "python3.9/site-packages/torch/utils/data/dataloader.py", line 376, in __init__
    sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
  File "python3.9/site-packages/torch/utils/data/sampler.py", line 164, in __init__
    raise ValueError(
ValueError: num_samples should be a positive integer value, but got num_samples=0
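As far as I can tell, this error means that `len(train_dataset)` is 0 when the `DataLoader` is built, so the dataset found no usable samples. Below is a minimal sketch of the sanity checks I would run before constructing the loader. The helper names `require_non_empty` and `find_missing_files` are my own and not part of this repo, and I am only assuming that the dataset is built from a list of wave file names resolved against a directory.

```python
from pathlib import Path


def require_non_empty(dataset, name="dataset"):
    """Fail early with a readable message instead of the RandomSampler error.

    Hypothetical helper (not from this repo): works with any object
    that implements __len__, such as a torch Dataset.
    """
    n = len(dataset)
    if n == 0:
        raise ValueError(
            f"{name} is empty (len == 0); check the paths and file lists "
            "in the config before building the DataLoader"
        )
    return n


def find_missing_files(file_names, wav_dir):
    """Return the entries of file_names that do not exist under wav_dir.

    Hypothetical helper: assumes the dataset resolves each listed
    file name relative to a single wave directory.
    """
    wav_dir = Path(wav_dir)
    return [name for name in file_names if not (wav_dir / name).is_file()]
```

Calling `require_non_empty(train_dataset, "train_dataset")` right before the `DataLoader` line in `train.py` would at least turn the sampler error into a clearer message, and `find_missing_files` can show whether the converted wave files are where the configuration expects them.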
Am I on the right path? Can you please double-check whether the tools in the master branch are the latest version used for training?
Also, I would like to know if I need to use Python 3.9 for this process; the script's dependencies did not work with newer versions.