piper
preprocess error
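(.venv) roxblox@PCRoxBlox:~/piper/src/python$ python3 -m piper_train.preprocess --language en --input-dir ~/piper/GLaDOS-Dataset --output-dir ~/piper/my-training --dataset-format ljspeech --single-speaker --sample-rate 22050
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/roxblox/piper/src/python/piper_train/preprocess.py", line 502, in <module>
    main()
  File "/home/roxblox/piper/src/python/piper_train/preprocess.py", line 143, in main
    for utt in make_dataset(args):
  File "/home/roxblox/piper/src/python/piper_train/preprocess.py", line 423, in ljspeech_dataset
    assert len(row) >= 2, "Not enough columns"
AssertionError: Not enough columns
I am on WSL on Windows 10.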
Hi @RoxBlox3, it seems that your transcription is not correct. Are you sure the transcription syntax is audio|text (in the case of LJSpeech)?
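One way to check this is to scan the metadata file for rows that would trip the "Not enough columns" assertion. The snippet below is only a rough sketch: `find_short_rows` is a hypothetical helper (not part of piper_train), and it assumes the preprocessor reads the file with Python's `csv.reader` using `|` as the delimiter, which the traceback suggests but does not prove.

```python
import csv
import io


def find_short_rows(lines, delimiter="|", min_columns=2):
    """Return (line_number, row) pairs for rows with fewer than min_columns fields.

    Hypothetical helper for illustration; `lines` is any iterable of text
    lines, such as an open file object.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    bad = []
    for row in reader:
        if len(row) < min_columns:
            # reader.line_num is the 1-based number of the last line read
            bad.append((reader.line_num, row))
    return bad


# Small in-memory sample: one good row, one blank line,
# and one row that uses a comma instead of a pipe.
sample = io.StringIO(
    "chellgladoswakeup01|Oh... It's you.\n"
    "\n"
    "epilogue03,Oh thank god, you're alright.\n"
)
problems = find_short_rows(sample)
for line_number, row in problems:
    print(f"line {line_number}: {row!r}")
```

To run it on a real dataset, open the file with `open("metadata.csv", encoding="utf-8", newline="")` and pass the file object to `find_short_rows`. Note that blank lines are reported too, since they parse as rows with zero columns.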
@rmcpantoja Thank you for the response. I think so? Here's an example of some lines, but perhaps the problem is something like punctuation. I'm quite new to all this, so I don't know.
a2_triple_laser01|Federal regulations require me to warn you that this next test chamber... is looking pretty good.
a2_triple_laser02|That's right. The facility is completely operational again.
a2_triple_laser03|I think these test chambers look even better than they did before. It was easy, really. You just have to look at things objectively, see what you don't need anymore, and trim out the fat.
chellgladoswakeup01|Oh... It's you.
chellgladoswakeup04|It's been a long time. How have you been?
chellgladoswakeup05|I've been really busy being dead. You know, after you MURDERED ME.
chellgladoswakeup06|Okay. Look. We both said a lot of things that you're going to regret. But I think we can put our differences behind us. For science. You monster.
epilogue03|Oh thank god, you're alright.
I am having the same issue as well. Did you ever find out what was wrong?
Hi @Ac3inSpac3, I had the same issue: the pipes weren't registered as the delimiter. A short Python script can replace those characters. Try:
import csv


def process_file_with_pipes_as_delimiter(input_filename):
    """Read the file and split every non-empty line on '|'."""
    processed_data = []
    with open(input_filename, 'r', encoding='utf-8') as file:
        for line in file:
            # Skip empty lines
            if line.strip() == '':
                continue
            fields = line.strip().split('|')
            processed_data.append(fields)
    return processed_data


def write_processed_data_to_csv(output_filename, data, delimiter='|'):
    """Write the rows back out with an explicit delimiter ('|' by default)."""
    with open(output_filename, 'w', encoding='utf-8', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=delimiter, quotechar='"', quoting=csv.QUOTE_MINIMAL)
        for row in data:
            writer.writerow(row)


def main():
    input_filename = 'metadata.csv'  # input file
    output_filename = 'output.csv'  # output file
    processed_data = process_file_with_pipes_as_delimiter(input_filename)
    write_processed_data_to_csv(output_filename, processed_data, delimiter='|')


if __name__ == '__main__':
    main()
@RoxBlox3
(.venv) roxblox@PCRoxBlox:~/piper/src/python$ python3 -m piper_train.preprocess --language en --input-dir ~/piper/GLaDOS-Dataset --output-dir ~/piper/my-training --dataset-format ljspeech --single-speaker --sample-rate 22050
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/roxblox/piper/src/python/piper_train/preprocess.py", line 502, in <module>
    main()
  File "/home/roxblox/piper/src/python/piper_train/preprocess.py", line 143, in main
    for utt in make_dataset(args):
  File "/home/roxblox/piper/src/python/piper_train/preprocess.py", line 423, in ljspeech_dataset
    assert len(row) >= 2, "Not enough columns"
AssertionError: Not enough columns
I am on WSL from windows 10.
Did you find a solution?
No, but it was probably an error in my transcription: when I tried another transcription file, it worked. In the end I did not finish making a new voice, because someone made a much better one than I could, so I did not continue.
I had the same issue, and I finally found out why: I had used pandas to delete some columns from the CSV file, and I forgot to set the delimiter to '|' when writing it back, so it was saved with the default comma.
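For anyone who wants to see why a comma-delimited file ends up as "Not enough columns", here is a small sketch using only the standard `csv` module rather than pandas. It assumes the preprocessor splits rows on `|`; `column_counts` is a hypothetical helper made up for this illustration.

```python
import csv
import io

rows = [
    ["chellgladoswakeup01", "Oh... It's you."],
    ["epilogue03", "Oh thank god, you're alright."],
]


def column_counts(text):
    """Count the fields per row when the text is parsed with '|' as the delimiter."""
    return [len(row) for row in csv.reader(io.StringIO(text), delimiter="|")]


# Written with the default delimiter (a comma), as happens when the
# delimiter is forgotten.
comma_out = io.StringIO()
csv.writer(comma_out).writerows(rows)

# Written with an explicit pipe delimiter.
pipe_out = io.StringIO()
csv.writer(pipe_out, delimiter="|").writerows(rows)

comma_counts = column_counts(comma_out.getvalue())
pipe_counts = column_counts(pipe_out.getvalue())
print("comma-delimited file, read with '|':", comma_counts)
print("pipe-delimited file, read with '|': ", pipe_counts)
```

The comma version yields a single column per row, which is what would fail a `len(row) >= 2` check. In pandas, the corresponding setting is the `sep` parameter, e.g. `df.to_csv("metadata.csv", sep="|", header=False, index=False)`. Whether the header and index should be dropped depends on your dataset, so treat those arguments as assumptions.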