Kaizhi Qian
@liveroomand Yes, in our case. But you can design your own speaker encoder or just use a one-hot embedding.
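A minimal sketch of what a one-hot speaker embedding could look like, assuming a fixed speaker list (the speaker names and dimension below are hypothetical, not from the AutoVC code):

```python
# One-hot speaker embedding as a drop-in replacement for a learned
# speaker encoder. Speaker IDs here are illustrative only.
import numpy as np

speakers = ["p225", "p226", "p227"]          # your training speakers
spk2idx = {s: i for i, s in enumerate(speakers)}

def onehot_embedding(speaker, dim=len(speakers)):
    """Return a one-hot vector identifying `speaker`."""
    emb = np.zeros(dim, dtype=np.float32)
    emb[spk2idx[speaker]] = 1.0
    return emb

emb = onehot_embedding("p226")               # e.g. [0., 1., 0.]
```

Note that a one-hot embedding only covers speakers seen during training, so the zero-shot conversion ability of a learned speaker encoder is lost.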
@miaoYuanyuan For other datasets, you need to tune the parameters of the conversion model instead of the parameters of the features.
@miaoYuanyuan If you change the feature parameters, you will need to retrain the WaveNet vocoder as well.
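For illustration, the feature settings that must stay consistent between the conversion model and the vocoder can be kept in one place. The values below follow common AutoVC-style defaults but should be treated as assumptions:

```python
# Both the conversion model's inputs and the vocoder's training data
# are mel-spectrograms computed with the same settings.
FEATURE_PARAMS = dict(
    sr=16000,        # sampling rate
    n_fft=1024,      # FFT size
    hop_length=256,  # frame shift
    n_mels=80,       # mel channels
    fmin=90,         # lowest mel frequency (Hz)
    fmax=7600,       # highest mel frequency (Hz)
)
# If any of these change, regenerate the training features AND
# retrain the WaveNet vocoder on features made with the new values.
```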
Please refer to the data preparation code for details.
You don't need to train them at the same time.
You can simply replace G with P, along with some other minor modifications.
All preprocessing steps are in the code except silence trimming, but I don't think that makes any fundamental difference. Your loss value looks fine.
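For the step not in the repo, a minimal silence-trimming sketch; `librosa.effects.trim`, the 20 dB threshold, and the file name are assumptions, not the authors' exact recipe:

```python
# Trim leading and trailing silence before feature extraction.
import librosa

wav, sr = librosa.load("p225_001.wav", sr=16000)   # hypothetical file
trimmed, _ = librosa.effects.trim(wav, top_db=20)  # drop quiet ends
```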
The train.pkl file is intended for training.
For testing, please refer to issue #108.
.pkl is not a format; it is just a filename suffix. You can name it whatever you like, such as .abc, .qaz, or .wsx. To save...
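A small sketch showing that pickle ignores the suffix entirely (the file name and metadata contents are just for illustration):

```python
# pickle neither checks nor cares about the file extension.
import pickle

metadata = [["p225", [0.1, 0.2], "p225_001.npy"]]  # toy example

with open("train.qaz", "wb") as f:   # any suffix works
    pickle.dump(metadata, f)

with open("train.qaz", "rb") as f:
    loaded = pickle.load(f)
```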