tfg-voice-conversion
May I submit a user guide for this repo?
Mr. Albert, I love your work, but it doesn't have a friendly user guide, and the code may have some bugs. I have been trying to run it these days and spent some time working out how to use it. May I share my notes, and could you please spend a little time checking them so that we can make the project friendlier? Here are my notes. Something seems to go wrong at step 6 when I use seq2seq, but I don't know where the problem is.
This project contains 3 solutions for voice conversion.
An MCEP-GMM solution based on SPTK tools (baseline solution of VCC2016: http://vc-challenge.org/summary.html)
A DNN-LSTM-GRU solution converting MVF-logf0-Mel-cepstrum features
A seq2seq-based solution converting extracted MVF-logf0-Mel-cepstrum features
To run the MCEP-GMM solution based on SPTK tools:
edit TRAIN_FILENAME in sptk_vc.sh to point to the files you need to convert
run sptk_vc.sh
the output wav is in data/training/gmm_vc
To run the DNN-LSTM-GRU conversion:
run lf0_lstm.py, mvf_dnn.py and mcp_gru.py to train the separate models that convert each feature
[optional] run lf0_post_training.py, mcp_post_training.py, mvf_post_training.py and mvf_plot_curves.py to verify the models
run decode_aho.sh to merge the features into a wav
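The steps above could be chained from one small driver, sketched here in Python. It assumes the scripts are run from the repository root with `python` and `bash`, which these notes do not state. `build_commands` and `run_pipeline` are hypothetical names, not part of the repo.

```python
import subprocess

# Script names are taken from the steps above; the interpreter choice
# ("python", "bash") and the working directory are assumptions.
TRAIN_SCRIPTS = ["lf0_lstm.py", "mvf_dnn.py", "mcp_gru.py"]
VERIFY_SCRIPTS = [
    "lf0_post_training.py",
    "mcp_post_training.py",
    "mvf_post_training.py",
    "mvf_plot_curves.py",
]
DECODE_SCRIPT = "decode_aho.sh"


def build_commands(verify=False):
    """Return the commands in the order the steps list them."""
    commands = [["python", name] for name in TRAIN_SCRIPTS]
    if verify:
        commands += [["python", name] for name in VERIFY_SCRIPTS]
    commands.append(["bash", DECODE_SCRIPT])
    return commands


def run_pipeline(verify=False, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_commands(verify):
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing script.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(verify=False, dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which makes it easy to check the order before anything is trained.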
To run seq2seq:
0. [optional] you can get all the training files in , and put them in data/training/
1. apt-get install sox, pip install tensorflow and so on
2. cp do_columns.pl to /usr/local/bin
3. get tfglib (https://github.com/albertaparicio/tfglib), edit seq2seq_datatable.py (there may be a bug in the parameter nb_classes), and install it
4. install ahocoder from source, and add the file $ to your path
5. edit data/test/speakers.list (add more speakers if step 0 was carried out?)
6. run /data/train/seq2seq_align_training.sh and /data/test/seq2seq_align_test.sh
7. run seq2seq.py
(There are some problems here: for some files, like file 200007, no .lf0.dat file was extracted, which may throw an error)
8. run seq2seq_decode_prediction.py
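One way to spot files like 200007 before training is to list the base names whose feature files are missing. The sketch below is only an assumption about the layout: it supposes that each recording is a `.wav` file sitting in the same directory as its `.lf0.dat` file. `find_incomplete` is a hypothetical helper, not part of the repo or tfglib.

```python
import os


def find_incomplete(directory, source_ext=".wav", required_exts=(".lf0.dat",)):
    """Return sorted base names that have a source file but lack a feature file.

    The ".wav" default and the same-directory layout are assumptions;
    ".lf0.dat" is the extension mentioned in the note above.
    """
    names = os.listdir(directory)
    present = set(names)
    incomplete = []
    for name in names:
        if not name.endswith(source_ext):
            continue
        base = name[: -len(source_ext)]
        # A base name is incomplete if any required feature file is absent.
        if any(base + ext not in present for ext in required_exts):
            incomplete.append(base)
    return sorted(incomplete)
```

Base names returned by this check could then be moved aside before running step 7.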
Dear HudsonHuang,
This project currently has no user guide because it is unfinished. The same applies to the possible bugs in the code (it would still help if you told me what they are). This repository contains the code of my bachelor's thesis, which is still being developed. Once it is finished (around May-June 2017), it will have a comprehensive user guide explaining how to use the code.
At the time of writing, the DNN-LSTM-GRU model is complete ('baseline' tag), as well as the SPTK model. The seq2seq model is the one under development.
The short guide you have written is useful to give users a fast way to see how to run the finished models. I must tell you, though, that the code in the seq2seq model is bound to change radically (we have been working with Keras, and are about to rewrite the model from scratch with TensorFlow), so this part of the guide will change in the near future.
Regarding the error about file 200007, I remember having found a similar issue. I will take a look at it. Meanwhile, I suggest you skip this file.
Last, but not least, I want to thank you for your interest in my project. Stay tuned for the new developments (I hope there will be some interesting results).
Best wishes,
Albert
Dear Albert,
Thank you for your comment. My research is about text-to-speech systems with speaker-specific features, so I have learned a lot from your project. I am looking forward to the completion of the project and am willing to give any help I can to build it together. Last, but not least, I think your project is the most advanced and complete open source DNN-based voice conversion project. That's great!
Best wishes,
Hudson
@HudsonHuang Hi, sorry to interrupt. Thank you for your user guide, which helped me understand Albert's code better. I have just one question about the MCEP-GMM based solution: can I use this model to convert my own voice file? I mean, can I just add a voice file to the corresponding directory and then change TRAIN_FILENAME?
Best wishes, J.SHI
@JingleiSHI
Yes, I think it would work. Besides, it would be helpful to check out the code here if any error happens.
Best wishes,
Hudson
@HudsonHuang
Thank you for your response; I hope I didn't bother you with my questions. I have run the program, but when I use a GMM model with -m 20, I get a det = 0 problem (the inverse matrix can't be calculated). I also don't understand the command on line 73, vc -m 2: why does m equal 2 and not 32? As for the number of iterations, are 100 iterations enough?
Best wishes, J.SHI
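On the det = 0 question, a small illustration of one common cause may help. It is a generic sketch, not code from this repository or from SPTK. When a mixture component ends up with too few (or linearly dependent) frames, its covariance matrix is singular and cannot be inverted. With more mixtures and the same amount of data, each component receives fewer frames, so this becomes more likely. Whether this is what happens in sptk_vc.sh is an assumption, and the function names below are hypothetical.

```python
def covariance_2d(points):
    """Maximum-likelihood covariance [[sxx, sxy], [sxy, syy]] of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return [[sxx, sxy], [sxy, syy]]


def determinant_2d(matrix):
    """Determinant of a 2x2 matrix given as nested lists."""
    return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]


def add_floor(matrix, floor):
    """Add a small constant to the diagonal (a common regularisation)."""
    return [
        [matrix[0][0] + floor, matrix[0][1]],
        [matrix[1][0], matrix[1][1] + floor],
    ]


# Two frames assigned to one component always lie on a line,
# so the 2-D covariance has rank 1 and its determinant is zero.
frames = [(0.0, 0.0), (2.0, 4.0)]
cov = covariance_2d(frames)
print(determinant_2d(cov))                   # 0.0
print(determinant_2d(add_floor(cov, 1e-3)))  # small positive value
```

Using fewer mixtures, more training data, or a variance floor like the one sketched here are the usual ways to keep the covariance invertible.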