acetylSv comments

Results: 7 comments by acetylSv

questions about the datasets.

Hi, I plotted a histogram of the character length of each line in the transcriptions and got this [plot](http://speech.ee.ntu.edu.tw/~acetylsv/BZ_char_len.png). Based on it, I decided to discard sentences longer than 300 characters.
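A minimal sketch of that filtering step, not the script actually used: it bins line lengths into a coarse histogram and drops lines over the 300-character threshold. The function names and the bin width are illustrative assumptions, and it assumes the transcription is available as one sentence per string.

```python
from collections import Counter

MAX_CHARS = 300  # threshold mentioned in the comment above


def length_histogram(lines, bin_width=50):
    """Count lines per character-length bin; each key is the bin's start value."""
    return Counter((len(line) // bin_width) * bin_width for line in lines)


def drop_long_lines(lines, max_chars=MAX_CHARS):
    """Keep only lines whose character length does not exceed max_chars."""
    return [line for line in lines if len(line) <= max_chars]


if __name__ == "__main__":
    # Toy data standing in for the transcription lines (hypothetical).
    sample = ["a" * 10, "b" * 120, "c" * 301, "d" * 300]
    print(sorted(length_histogram(sample).items()))
    print(len(drop_long_lines(sample)), "lines kept")
```

Counting in bins first makes it easy to see where the long tail starts before committing to a cutoff.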

questions about the datasets.

I used only the segmented part of the Blizzard-2013 dataset, which contains 9742 files and about 20 hours of audio. So I'm not sure what will happen if you switch to the bigger one....

Invalid reference audio?

Maybe the pre-trained model has not converged to a good point. What kinds of reference audio clips have you tried?

how to use pre-trained models

Did you specify the pre-trained model path and the inference input text file path in hyperparams.py?

How to get the "mceps.hdf5" file?

I used [Python-Wrapper-for-World-Vocoder](https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder) to first extract the SP, AP and f0 acoustic features, and then used [pysptk](https://github.com/r9y9/pysptk) to convert the SPs to mceps. Hope these two repos solve your problem.
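A rough sketch of that two-step extraction, assuming the `pyworld` and `pysptk` packages are installed. The function name and the values for `order`, `alpha` and `frame_period` are illustrative assumptions, not the settings used for the original `mceps.hdf5`.

```python
def extract_world_features(wav, fs, order=24, alpha=0.42, frame_period=5.0):
    """Return (f0, sp, ap, mceps) for a mono waveform sampled at fs Hz.

    order, alpha and frame_period are assumed defaults for illustration.
    """
    # Imported here so the sketch can be loaded without the optional packages.
    import numpy as np
    import pysptk
    import pyworld

    # WORLD expects a contiguous float64 array.
    x = np.ascontiguousarray(wav, dtype=np.float64)

    # Step 1: WORLD analysis gives f0, spectral envelope (SP), aperiodicity (AP).
    f0, t = pyworld.harvest(x, fs, frame_period=frame_period)
    sp = pyworld.cheaptrick(x, f0, t, fs)
    ap = pyworld.d4c(x, f0, t, fs)

    # Step 2: convert the power spectral envelope to mel-cepstral coefficients.
    mceps = pysptk.sp2mc(sp, order=order, alpha=alpha)
    return f0, sp, ap, mceps
```

The result has one row of `order + 1` coefficients per analysis frame. A suitable `alpha` depends on the sampling rate, so the value above should be treated as a placeholder.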

How to get the "mceps.hdf5" file?

> Can you kindly share your script that extracts sp, ap, f0 and mceps?

Hi, the functions I used to extract those features and synthesize them back are [here](https://github.com/acetylSv/cycle_gan_vc/blob/35c1609bffd298eb8179a00bb3f396b17f964f94/utils.py#L77). Hyperparameters...

Do you have any demo wav files to compare?

Hi, I've uploaded some samples to the 'results/' directory. As you can hear, I couldn't get results as good as the author's own demo either. I'm not sure whether this is mainly...