
PyTorch implementation of ByteDance's "Cross-Speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech"

10 Cross-Speaker-Emotion-Transfer issues

Hello author, firstly, thank you for providing this repo; it is really nice. I have a question: 1. I downloaded the CMU data for a single speaker with 100 audios and...

corpus_path: "output/ckpt/RAVDESS"
raw_path: "output/ckpt/RAVDESS/450000.pth/data"

I'm trying to run `synthesize` with the pretrained model, like so: ```bash python3 synthesize.py --text "This sentence is a test" --speaker_id Actor_01 --emotion_id neutral --restore_step 450000 --dataset RAVDESS --mode single...

Hi, I am facing the following issue while synthesizing with the pretrained model. Removing weight norm... Traceback (most recent call last): File "synthesize.py", line 234, in )) if load_spker_embed else None...
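The truncated traceback ends at a line that loads the speaker embedding only `if load_spker_embed`; a common cause (an assumption here, not confirmed by the snippet) is a missing speaker-embedding `.npy` file. A defensive version of that load, using a hypothetical helper name, might look like:

```python
import os
import numpy as np

def load_speaker_embedding(path, load_spker_embed=True):
    """Return the speaker embedding array if requested and present, else None.

    Raises a clear error when the .npy file was never downloaded,
    instead of failing deep inside synthesis.
    """
    if not load_spker_embed:
        return None
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Speaker embedding not found: {path} "
            "(download the .npy files or disable load_spker_embed)")
    return np.load(path)
```

This keeps the original `if load_spker_embed else None` behavior but surfaces the missing-file case explicitly.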

This project is great. How can I train on Mandarin? It seems that Mandarin is not supported, and there is no Mandarin-specific processing in the code.

I am curious as to why you used HiFi-GAN or MelGAN rather than the vocoder (WaveRNN) described in the paper. Hello, thank you for sharing the code. In this code, regarding what is described in the paper...

I downloaded the model from https://drive.google.com/drive/folders/1QszdJC7dzBrQHntiLxYcG8ewczvoK4q1 and tested inference with the command below: python3 synthesize.py --text "Hello!" --speaker_id Actor_22 --emotion_id sad --restore_step 450000 --mode single --dataset RAVDESS. The output audio is obviously...

The current implementation is not trained in a semi-supervised way due to the small dataset size, but this can easily be activated by specifying target speakers and passing no emotion...
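The semi-supervised setup described above can be sketched as a masked classification loss: utterances with emotion labels contribute a cross-entropy term, while unlabeled target-speaker utterances are excluded from it. A minimal NumPy sketch (an illustration of the idea, not the repo's actual training code):

```python
import numpy as np

def masked_emotion_loss(logits, emotion_ids, has_label):
    """Cross-entropy computed only over rows whose emotion label is known.

    logits:      (B, n_emotions) raw scores
    emotion_ids: (B,) int labels (dummy values for unlabeled rows)
    has_label:   (B,) bool mask; False marks unlabeled target-speaker rows
    """
    if not has_label.any():
        return 0.0  # batch contains only unlabeled target-speaker data
    z = logits[has_label]
    ids = emotion_ids[has_label]
    z = z - z.max(axis=1, keepdims=True)  # numerically stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(ids)), ids].mean())
```

Changing the logits of an unlabeled row leaves the loss unchanged, which is exactly how "passing no emotion label" deactivates emotion supervision for the target speakers.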

Hi, thank you for open-sourcing this wonderful work! I followed your instructions: 1) install `lightconv_cuda`, 2) download the [checkpoint](https://drive.google.com/drive/folders/1QszdJC7dzBrQHntiLxYcG8ewczvoK4q1), 3) download the [speaker embedding npy](https://drive.google.com/drive/folders/1a4YW2UWdlF9RTqG_phv_VbRjyEcAld7t). However, the generated...

I am trying to train on Korean data. Would it be possible to share a version of the code that includes Korean preprocessing?