
fuel datasets

Open dribnet opened this issue 7 years ago • 14 comments

The referenced fuel datasets ['arctic', 'blizzard', 'dimex', 'librispeech', 'pavoque', 'vctk'] are not in the fuel distribution. Are there standard converters for any of these already in other projects?
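In case it helps anyone who lands here, below is a minimal sketch of what a fuel-compatible converter could look like, using h5py plus fuel's H5PYDataset split machinery. The file name, source names, shapes, and split boundaries are all hypothetical, and real utterances would be variable-length, which needs more care than this fixed-shape example shows:

```python
# Hypothetical sketch of a fuel-style converter: pack per-utterance
# vocoder features and label sequences into an HDF5 file that
# fuel's H5PYDataset can read. All names and shapes are illustrative.
import h5py
import numpy as np
from fuel.datasets.hdf5 import H5PYDataset

n_utterances, n_frames, feature_dim, label_len = 100, 500, 67, 50

with h5py.File('vctk.hdf5', mode='w') as f:
    features = f.create_dataset(
        'features', (n_utterances, n_frames, feature_dim), dtype='float32')
    labels = f.create_dataset(
        'labels', (n_utterances, label_len), dtype='int32')

    # Fill with real preprocessed data here; random values just keep
    # the sketch runnable end to end.
    features[...] = np.random.rand(n_utterances, n_frames, feature_dim)
    labels[...] = np.random.randint(0, 40, size=(n_utterances, label_len))

    # Axis labels tell fuel how to interpret each dimension.
    for dim, label in zip(features.dims, ('batch', 'time', 'feature')):
        dim.label = label
    for dim, label in zip(labels.dims, ('batch', 'index')):
        dim.label = label

    # Declare splits in the format H5PYDataset expects.
    split_dict = {
        'train': {'features': (0, 90), 'labels': (0, 90)},
        'valid': {'features': (90, 100), 'labels': (90, 100)},
    }
    f.attrs['split'] = H5PYDataset.create_split_array(split_dict)

# Reading it back:
train_set = H5PYDataset('vctk.hdf5', which_sets=('train',))
print(train_set.num_examples)  # 90
```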

dribnet avatar Feb 22 '17 00:02 dribnet

Hey! Thanks for your interest.

Unfortunately, some of the datasets are not available publicly (like blizzard). For the others, we plan to release a preprocessed version so people can use them. We have a rough series of instructions for preprocessing, but it requires installing quite a few libraries, so I'm not sure that you'd like to go that way. Let me know what you would prefer.

Right now, we are working on finishing our ICML submission. After this (probably this weekend), we will have more time to shape up the code and data. We should also release some pretrained models for everyone to explore.

sotelo avatar Feb 22 '17 02:02 sotelo

I want to know how to train with UTF-8 text processing and how to train on the Arctic data.

slbinilkumar avatar Feb 23 '17 05:02 slbinilkumar

@sotelo, is this the way you preprocess the data for this project?: https://github.com/sotelo/world.py

Thank you for your work!!

Zeta36 avatar Feb 23 '17 13:02 Zeta36

@Zeta36 Hi! No, it's not like that. We will describe how we do it soon. With the ICML deadline coming up, we are finishing the paper, but we should be ready to help others with replication afterwards.

sotelo avatar Feb 23 '17 13:02 sotelo

Hi @sotelo, this is great. Where can I find your email? I would definitely like to keep up with your progress.

AdamMiltonBarker avatar Feb 23 '17 14:02 AdamMiltonBarker

Hello, @sotelo. Any news about your project? (I haven't seen any updates on your website for a while now.)

By the way, you said to @dribnet: "Unfortunately, some of the datasets are not available publicly (like blizzard). For the others, we plan to release a preprocessed version so people can use them."

Are you finally going to release this, or at least explain how to do this preprocessing step?

Thanks a lot!!

Zeta36 avatar Mar 05 '17 08:03 Zeta36

Hello, @sotelo.

"So, we're currently in the process of doing this. It's a bit messy because the data processing requires installing a few C libraries. Now, we're deliberating whether we should proceed with wrappers (basically updating my old world.py repo) or we just should point people to the instructions on how to do the processing themselves."

It would be wonderful to have either of the two possibilities. No hurry anyway, we will be waiting :).

Regards!!

Zeta36 avatar Mar 07 '17 20:03 Zeta36

Hello, @sotelo. My data shapes printed by datasets.py:

features shape: (1001, 500, 67), dtype: float64
features_mask shape: (1001, 500), dtype: float64
labels shape: (500, 1914), dtype: int32
labels_mask shape: (500, 1914), dtype: float64

Is this correct? (seq_size is 1000, batch_size is 500, feature_dim is 67)

Why are labels and labels_mask not passed through _transpose?
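For reference, here is a small plain-numpy sketch of the layout in question; the arrays and the transpose below are my own stand-ins, not the repository's actual _transpose helper:

```python
# Minimal numpy sketch of the shape convention being asked about.
import numpy as np

batch_size, seq_size, feature_dim, label_len = 500, 1000, 67, 1914

# Features as loaded: one extra frame (seq_size + 1) per utterance, batch-major.
features_bm = np.zeros((batch_size, seq_size + 1, feature_dim), dtype=np.float64)

# Time-major layout, as printed above: (time, batch, feature).
features_tm = features_bm.transpose(1, 0, 2)
print(features_tm.shape)  # (1001, 500, 67)

# Labels have no frame axis, only (batch, label_length), so there is
# no time dimension to swap; they can stay batch-major.
labels = np.zeros((batch_size, label_len), dtype=np.int32)
print(labels.shape)  # (500, 1914)
```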

Regards!!

dp-aixball avatar Mar 30 '17 14:03 dp-aixball

@sotelo The features include 60 MGC, 5 BAP, 1 lf0, and 1 v/uv coefficient per frame, at 5 ms per frame. The labels are plain phoneme index sequences (label_type set to unaligned_phonemes in the code); I'm not using the raw_audio processing for now. The audio files are a bit long, each containing 1 to 5 sentences; I will split them into single sentences later. Is that necessary?
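For anyone wondering how a 67-dimensional frame like this could be assembled, here is a rough sketch using pyworld and pysptk. This is not the project's actual pipeline; the analysis parameters, the alpha value, and the file name are guesses, and the number of coded aperiodicity bands depends on the sample rate:

```python
# Rough sketch: build 67-dim frames (60 MGC + 5 BAP + 1 lf0 + 1 v/uv)
# at 5 ms per frame with pyworld + pysptk. Parameters are illustrative.
import numpy as np
import pyworld
import pysptk
import soundfile as sf

x, fs = sf.read('utterance.wav')          # hypothetical 48 kHz mono file
x = np.ascontiguousarray(x, dtype=np.float64)

frame_period = 5.0                         # ms per frame
f0, t = pyworld.dio(x, fs, frame_period=frame_period)
f0 = pyworld.stonemask(x, f0, t, fs)       # refined F0
sp = pyworld.cheaptrick(x, f0, t, fs)      # spectral envelope
ap = pyworld.d4c(x, f0, t, fs)             # aperiodicity

mgc = pysptk.sp2mc(sp, order=59, alpha=0.77)   # 60 mel-generalized cepstra
bap = pyworld.code_aperiodicity(ap, fs)        # 5 bands at 48 kHz
voiced = (f0 > 0).astype(np.float64)           # 1 v/uv flag
lf0 = np.where(f0 > 0, np.log(np.maximum(f0, 1e-10)), 0.0)

features = np.hstack([mgc, bap, lf0[:, None], voiced[:, None]])
print(features.shape)  # (n_frames, 67) for 48 kHz input
```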

Your seq_size=50 corresponds to 250 ms per segment? That seems very short. Also, I can't find a SegmentSequence step applied to unaligned_phonemes in your code; why is that? I just want to train a mapping from unaligned_phonemes to vocoder features. What should I do?

Thanks!

dp-aixball avatar Mar 31 '17 03:03 dp-aixball

@sotelo

How should I understand these label types: 'full_labels', 'phonemes', 'unaligned_phonemes', and 'text'? Thanks!

dp-aixball avatar Apr 01 '17 11:04 dp-aixball

Yes, I'm using unaligned phonemes and have trained one model. The MSE went from 150 to 6.1, but the sound produced through the vocoder is not right. I'm checking...

dp-aixball avatar Apr 01 '17 13:04 dp-aixball

Any updates on this? I wanted to do some experiments with VCTK, but couldn't figure out how to preprocess the data.

reuben avatar Jun 21 '17 01:06 reuben

Hey @sotelo, is there a way to use parrot for finding phoneme boundaries? I am working on concatenative synthesis and it would be a very nice feature to have.

caoba1 avatar Jun 23 '17 13:06 caoba1

"For the others, we plan to release a preprocessed version so people can use them. We have a rough series of instructions for preprocessing, but it requires installing quite a few libraries, so I'm not sure that you'd like to go that way. Let me know what you would prefer."

Hey @sotelo, is there any new information regarding the preprocessed version, or could you upload or point to the mentioned rough series of instructions? That would be very helpful.

Thanks in advance!

ystrehlow avatar Aug 17 '17 11:08 ystrehlow