gst-tacotron

preprocessing the training data

Open marymirzaei opened this issue 6 years ago • 2 comments

Thank you very much for your nice work. I have a problem with preprocessing the training data. The transcript file for the Blizzard2013 segmented data is a file named prompts.gui, which can be found here: https://www.dropbox.com/s/6ugwnbqgwlfvxvl/prompts.gui?dl=0 I was wondering what the metadata.train file should look like. It seems that I need to clean up the attached file so that it can be used for training and matches the expected format. Could you upload your cleaned-up 'metadata-train' file, the converter from prompts.gui to metadata-train, or a description of the desired format of the metadata.train file?

marymirzaei May 09 '18 06:05

Hi, I simply extract the text from prompts.gui, ignoring other information such as prosody.

You can see the file format in the attachment: metadata.zip
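For readers without the attachment, a minimal sketch of this kind of extraction might look like the following. It is not the script used for this repo. It assumes that each record in prompts.gui spans three lines (utterance ID, marked-up text, phone and prosody labels), that `@` and `#` are break markers rather than text, and that the output is one `id|text` line per utterance. The function names and the `id|text` layout are hypothetical, so check them against the real metadata file before training.

```python
import re

# Assumption: "@" and "#" are prosodic break markers, not part of the text.
MARKERS = {"@", "#"}


def clean_text(raw):
    """Drop marker tokens and reattach punctuation to the preceding word."""
    tokens = [t for t in raw.split() if t not in MARKERS]
    text = " ".join(tokens)
    # The prompt file separates punctuation with a space; remove that space.
    return re.sub(r"\s+([.,:;!?])", r"\1", text)


def parse_prompts(lines):
    """Yield (utterance_id, text) pairs from three-line records.

    Assumption: each record is id / marked-up text / phone-and-prosody line,
    and blank lines between records can be ignored.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    for i in range(0, len(rows) - 2, 3):
        yield rows[i], clean_text(rows[i + 1])


def to_metadata_line(utt_id, text):
    """Format one output line; the id|text layout is a guess, not the repo's."""
    return f"{utt_id}|{text}"
```

To convert a whole file, open prompts.gui, pass the file object to `parse_prompts`, and write `to_metadata_line(utt_id, text)` for each pair. The third line of every record is skipped, which matches the "ignore prosody" approach described above.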

syang1993 May 09 '18 06:05

> Hi, I simply extract the text from prompts.gui, ignoring other information such as prosody.
>
> You can see the file format in the attachment: metadata.zip

Do you know what the other information is? I can't understand what the 3rd line of each entry in prompts.gui means. Here is an example:

CA-BB-01-01
Black Beauty @ : # the Autobiography @ of a Horse . #
B L 62iHfN KcF _ B y13iHfW ^ T Y2iLfN @ : || _ DH Y2iLfN cYa _ 33iHfN ^ T N42iLfN ^ B 6y2iLfN cY ^ 42iHfN ^ GcS R N41iLfN ^ F Y1iLfN cYa @ _ N41iLfN VcD _ N41iLfN _ H 32iHfW R ScT . ||

CruelPaw Mar 18 '20 11:03