tts-arabic-pytorch

Make the training reproducible

Open rodfersou opened this issue 1 year ago • 1 comments

Great work so far!

We are trying to reproduce the training with more dialects:

  • As a first step, we plan to change the voice in the original corpus's wave files. (I'm running Coqui to convert them to another voice style.)
  • Later on, we plan to use our own set of wave files with the new dialect

The issues I found so far in the first step:

  • There is no documentation about running the scripts/extract_f0.py part to create the data/pitch_dict.pt file; it would be good to add this to the docs. I learned it by reading some code.
  • After this step, the first time I run train.py with my custom configuration, I get an error because the ArabDataset class is not called correctly: the call passes a cache parameter that doesn't exist in the actual implementation.
  • I removed the parameter to try to brute force the results, but it led to another error:
Traceback (most recent call last):
  File "tts-arabic-pytorch/train.py", line 205, in <module>
    main()
  File "tts-arabic-pytorch/train.py", line 157, in main
    train_loader = DataLoader(train_dataset,
  File "python3.9/site-packages/torch/utils/data/dataloader.py", line 376, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "python3.9/site-packages/torch/utils/data/sampler.py", line 164, in __init__
    raise ValueError(
ValueError: num_samples should be a positive integer value, but got num_samples=0
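
The ValueError above is raised by RandomSampler when the dataset handed to DataLoader reports a length of zero, which suggests that train_dataset loaded no samples. Below is a minimal sketch of a check that could run before the DataLoader is built, assuming the dataset implements `__len__`. The helper `assert_non_empty` is hypothetical and not part of the repository, and a plain list stands in for the dataset.

```python
def assert_non_empty(dataset, name="dataset"):
    """Raise a descriptive error if `dataset` reports zero samples.

    Hypothetical helper for illustration; not part of tts-arabic-pytorch.
    """
    n = len(dataset)
    if n == 0:
        raise ValueError(
            f"{name} is empty (len == 0); check the file list and audio "
            "paths in the config before building the DataLoader"
        )
    return n


# A plain list stands in for an empty dataset object here.
try:
    assert_non_empty([], name="train_dataset")
except ValueError as err:
    print(err)
```

Calling a check like this right after the dataset is constructed would surface the empty dataset earlier and closer to its cause, such as wrong paths or a filter that discards every entry.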

Am I on the right path? Can you please double-check whether the tools in the master branch are the latest version used for training?

Also, I would like to know if I need to use Python 3.9 for this process; the script's dependencies did not work with newer versions.

rodfersou, Nov 05 '24 11:11