ForwardTacotron

preprocess.py: list index out of range

rmcpantoja opened this issue 2 years ago · 5 comments

Hi @cschaefer26, after the espeak backend acceleration feature was merged, I downloaded the repository at the latest commit to preprocess a small test dataset (I also tested a dataset of 10,000 audios, which I want to train as a pretrained model for smaller datasets) and I always get this error. I debugged the "cleaned_texts" array: it stays completely empty, and no elements are added during preprocessing. With the commit before the espeak feature, the opposite happens: each audio is processed along with its text, added to that array, and all the phonemized texts are displayed. I've tried many datasets, including in a Colab notebook with the current state of the repository, and the result is the same error. I mention this in case it helps you track it down. Here is a log:

280 .wav files found in "C:\Users\LENOVO_User\ForwardTacotron\testdataset"
Using 280 wav files that are indexed in metafile.

+-------------+-----------+--------+------------+-----------+----------------+
| Sample Rate | Bit Depth | Mu Law | Hop Length | CPU Usage | Num Validation |
+-------------+-----------+--------+------------+-----------+----------------+
|       22050 |         9 | True   |        256 | 4/8       |            200 |
+-------------+-----------+--------+------------+-----------+----------------+

input text to phonemize() is str but it must be list of str
█░░░░░░░░░░░░░░░ 1/280 texts: []
input text to phonemize() is str but it must be list of str
█░░░░░░░░░░░░░░░ 2/280 texts: []
input text to phonemize() is str but it must be list of str
█░░░░░░░░░░░░░░░ 3/280 texts: []
input text to phonemize() is str but it must be list of str


████████████████ 280/280 texts: []
Traceback (most recent call last):
  File "preprocess.py", line 150, in <module>
    print(f'First val sample: {val_dataset[0][0]}')
IndexError: list index out of range

rmcpantoja avatar Nov 01 '22 04:11 rmcpantoja

Hi, this is strange. Could you check your version of the phonemizer package? You could also run the cleaner test to see what the problem might be (you probably need to pip install pytest first):

python -m pytest tests/test_cleaner.py

cschaefer26 avatar Nov 01 '22 08:11 cschaefer26

Locally I run phonemizer==2.2 without any issues.

cschaefer26 avatar Nov 01 '22 08:11 cschaefer26

@cschaefer26 I'm using the latest phonemizer version (3.2.1). I just ran the pytest test, and it fails with the same error message that appears when each audio is processed. I also discovered something curious that changed in phonemizer 3.0, so I'm leaving you the log so you can check it:

================================================= test session starts =================================================
platform win32 -- Python 3.7.9, pytest-7.1.3, pluggy-1.0.0
rootdir: C:\Users\LENOVO_User\ForwardTacotron
plugins: anyio-3.6.1
collected 1 item

tests\test_cleaner.py F [100%]

====================================================== FAILURES =======================================================
__________________________________________ TestCleaner.test_call_happy_path ___________________________________________

self = <tests.test_cleaner.TestCleaner testMethod=test_call_happy_path>

def test_call_happy_path(self) -> None:
    cleaner = Cleaner(cleaner_name='no_cleaners',
                      use_phonemes=True, lang='en-us')
>       cleaned = cleaner('hello there!')

tests\test_cleaner.py:11:


utils\text\cleaners.py:82: in __call__
    text = self.backend.phonemize(text, strip=True)


self = <phonemizer.backend.espeak.espeak.EspeakBackend object at 0x00000188CDC9A488>, text = 'hello there!'
separator = None, strip = True, njobs = 1

def phonemize(self, text: List[str],
              separator: Optional[Separator] = None,
              strip: bool = False,
              njobs: int = 1) -> List[str]:
    """Returns the `text` phonemized for the given language

    Parameters
    ----------
    text: list of str
        The text to be phonemized. Each string in the list
        is considered as a separated line. Each line is considered as a text
        utterance. Any empty utterance will be ignored.

    separator: Separator
        string separators between phonemes, syllables
        and words, default to separator.default_separator. Syllable separator
        is considered only for the festival backend. Word separator is
        ignored by the 'espeak-mbrola' backend.

    strip: bool
        If True, don't output the last word and phone separators
        of a token, default to False.

    njobs : int
        The number of parallel jobs to launch. The input text is
        split in ``njobs`` parts, phonemized on parallel instances of the
        backend and the outputs are finally collapsed.

    Returns
    -------
    phonemized text: list of str
        The input ``text`` phonemized for the given ``language`` and ``backend``.

    Raises
    ------
    RuntimeError
        if something went wrong during the phonemization

    """
    if isinstance(text, str):
        # changed in phonemizer-3.0, warn the user
        raise RuntimeError(
          'input text to phonemize() is str but it must be list of str')

E RuntimeError: input text to phonemize() is str but it must be list of str

..\AppData\Local\Programs\Python\Python37\lib\site-packages\phonemizer\backend\base.py:182: RuntimeError
=============================================== short test summary info ===============================================
FAILED tests/test_cleaner.py::TestCleaner::test_call_happy_path - RuntimeError: input text to phonemize() is str but ...
================================================== 1 failed in 1.23s ==================================================

rmcpantoja avatar Nov 01 '22 12:11 rmcpantoja

Thx for looking into it. I just merged a fix that should solve the problem (it seems that phonemizer >= 3.0 changed the interface to expect a List[str] instead of Union[str, List[str]]).
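For reference, a minimal sketch of the kind of compatibility shim that handles this interface change — the `phonemize_compat` name and the `strip` default are illustrative, not the actual merged fix:

```python
from typing import List, Union


def phonemize_compat(backend, text: Union[str, List[str]], strip: bool = True) -> str:
    """Call a phonemizer backend across the 2.x/3.x interface change.

    phonemizer >= 3.0 raises a RuntimeError if `text` is a plain str; it
    requires a list of str and returns one phonemized string per input
    line. Wrapping the input and joining the output keeps older call
    sites (which passed and received a single str) working unchanged.
    `backend` is any object with a `phonemize(texts, strip=...)` method.
    """
    if isinstance(text, str):
        text = [text]          # satisfy the 3.x list-of-str contract
    phonemized = backend.phonemize(text, strip=strip)
    return ' '.join(phonemized)  # collapse back to a single string
```

With this in place, `cleaner.__call__` can keep passing a single utterance string while the backend call remains valid under both phonemizer 2.x and 3.x.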

cschaefer26 avatar Nov 01 '22 14:11 cschaefer26

@cschaefer26 Thank you very much, it works correctly now. I'm thinking of making a public Tacotron pretrained model so that smaller datasets can be trained from it. My plan is to increase the training schedule from 40k steps to 80k steps, and then, when training the smaller dataset, resume from the 40k checkpoint, since the Tacotron model was trained up to 40k. Can this work?

rmcpantoja avatar Nov 09 '22 13:11 rmcpantoja