dc_tts
ValueError: not enough values to unpack (expected 3, got 1)
When attempting to run prepo.py on my dataset, I get this error:
(Python36) C:\Anaconda3\envs\Python36\dc_tts-master>python prepo.py
2019-12-16 14:05:25.171748: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Traceback (most recent call last):
File "prepo.py", line 17, in
But when I use prepo.py on the LJ Speech dataset, I get no errors. I have meticulously formatted my dataset to match the format of the LJ Speech dataset, down to the exact sample rate of the LJ audio. Yet I still get that error. I have been pulling my hair out for over an hour trying to figure out what's wrong with my dataset, since even the files are named the same as the originals. I know for a fact it's not because I only have 100 samples, because I successfully ran prepo.py with only 70 samples in the dataset.
fname, _, text = line.strip().split("|")
This is the key line. It expects each line of the transcript to have a format like filename1|text 1 2 3|text one two three
If your dataset has only two columns, you can omit the _
and use fname, text = line.strip().split("|")
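To make the difference concrete, here is a minimal sketch of both unpacking variants. The helper names `parse_three` and `parse_two` and the sample lines are made up for illustration; only the `split("|")` statements come from the thread.

```python
def parse_three(line):
    # Same unpacking as the line quoted above: requires exactly three "|"-separated fields.
    fname, _, text = line.strip().split("|")
    return fname, text


def parse_two(line):
    # Variant for a two-column transcript (filename|text).
    fname, text = line.strip().split("|")
    return fname, text


print(parse_three("filename1|text 1 2 3|text one two three"))
print(parse_two("filename1|text one two three"))

# A line with no "|" at all splits into a single element, so unpacking fails.
try:
    parse_three("filename1 text one two three")
except ValueError as err:
    print(err)  # not enough values to unpack (expected 3, got 1)
```

Note that a blank line behaves the same way: `"".split("|")` returns a list with one empty string, so an empty line anywhere in the transcript also produces "expected 3, got 1".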
@ssnake I thought I set my dataset up like that, but maybe I missed something. There were multiple times when the speaker said "a few thousand", so would I write that as "a few 1,000" where text 1 2 3 would go?
My dataset including the transcript
https://github.com/Kyubyong/dc_tts/files/3973689/Michael_Stevens-Dataset.zip
Look at the error again:
fname, _, text = line.strip().split("|")
ValueError: not enough values to unpack (expected 3, got 1)
It says that the split results in a list of one element when it should have three. That means some line may have the wrong format, and this is the root of your issue. Examine the value of line before it reaches that statement.
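One way to examine the lines without stepping through prepo.py is to scan the transcript separately and report every line that would not unpack. This is only a sketch: `find_bad_lines` is a hypothetical helper, not part of dc_tts, and the sample lines are invented.

```python
def find_bad_lines(lines, expected_fields=3, sep="|"):
    """Return (line_number, field_count, raw_line) for each line that would fail to unpack."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.strip().split(sep)
        if len(fields) != expected_fields:
            bad.append((lineno, len(fields), line.rstrip("\n")))
    return bad


# Invented sample data: one good line, one with a wrong separator, one blank line.
sample = [
    "clip_0001|a few 1,000|a few thousand\n",
    "clip_0002!hello there!hello there\n",
    "\n",
]

for lineno, count, raw in find_bad_lines(sample):
    print(f"line {lineno}: {count} field(s): {raw!r}")
```

To check a real transcript, pass an open file instead of the list, for example `with open("transcript.csv", encoding="utf-8") as f: print(find_bad_lines(f))`, where the path is a placeholder for wherever your transcript lives.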
I struggled with the same error for hours... then I found out deep in my csv file there was a ! instead of | that was messing up everything