Yuancheng0625
The data processing for NS2 differs in some respects from that of other TTS models. We will update the data processing section as soon as possible.
Hi, we will release a new checkpoint and a data processing pipeline for a large dataset (> 10,000 hours) in about two weeks. For now, we only use LibriTTS to train...
Hi, you can split off part of the training files to use as test and dev files. valid json: https://drive.google.com/file/d/18wXIJjO8RgLnaj5e3hcYkyOHGtBRtv8y/view?usp=drive_link train json: https://drive.google.com/file/d/1bquMJRyQ9F1In0w_seLEma0GzJFqfzNj/view?usp=drive_link
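As a rough illustration of the split suggested above, here is a minimal sketch. It assumes the training metadata is a list of entries (for example, loaded from the train json); the function name `split_entries`, the 5% ratios, and the `Uid` field are hypothetical and not taken from the Amphion code.

```python
import random


def split_entries(entries, dev_ratio=0.05, test_ratio=0.05, seed=0):
    """Shuffle entries reproducibly and carve off dev and test subsets.

    The ratios are illustrative assumptions, not values from the repository.
    """
    shuffled = list(entries)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split

    n_dev = int(len(shuffled) * dev_ratio)
    n_test = int(len(shuffled) * test_ratio)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


# Hypothetical metadata: one dict per utterance.
entries = [{"Uid": f"utt_{i:04d}"} for i in range(100)]
train, dev, test = split_entries(entries)
print(len(train), len(dev), len(test))
```

Each subset could then be written back out with `json.dump` under whatever file names the training recipe expects.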
Hi, the details can be found in https://github.com/open-mmlab/Amphion/blob/main/egs/tta/autoencoderkl/exp_config_latent_4_10_78.json
Hi, valid json: https://drive.google.com/file/d/18wXIJjO8RgLnaj5e3hcYkyOHGtBRtv8y/view?usp=drive_link train json: https://drive.google.com/file/d/1bquMJRyQ9F1In0w_seLEma0GzJFqfzNj/view?usp=drive_link
These parameters are used to predict the speaker ID during training and are not used during inference. Please ignore them.
Hi, there is no difference between FACodecEncoder and FACodecEncoderV2. The difference between FACodecDecoder and FACodecDecoderV2 is that the prosody part of FACodecDecoderV2 uses a pitch-shifted waveform to achieve better disentanglement...
Hi, you need to pad your wav so that its length is a multiple of 200 (the hop size).
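A minimal sketch of that padding step, assuming the waveform is a NumPy array with time on the last axis and that right-padding with zeros is acceptable. The reply does not say which padding mode the model expects, and the helper name `pad_to_hop_multiple` is hypothetical.

```python
import numpy as np

HOP_SIZE = 200  # hop size stated in the reply above


def pad_to_hop_multiple(wav: np.ndarray, hop_size: int = HOP_SIZE) -> np.ndarray:
    """Right-pad the last axis with zeros up to the next multiple of hop_size."""
    remainder = wav.shape[-1] % hop_size
    if remainder == 0:
        return wav  # already aligned, nothing to pad
    pad_amount = hop_size - remainder
    # Pad only the time (last) axis; leave batch/channel axes unchanged.
    pad_width = [(0, 0)] * (wav.ndim - 1) + [(0, pad_amount)]
    return np.pad(wav, pad_width, mode="constant")


wav = np.zeros(16050, dtype=np.float32)  # 16050 % 200 == 50, so 150 samples are added
padded = pad_to_hop_multiple(wav)
print(padded.shape)  # (16200,)
```

The same arithmetic carries over to a `torch.Tensor` if the waveform is already on the model's device; only the padding call would change.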
Hi, which checkpoint are you using? You can follow:
```python
from huggingface_hub import hf_hub_download

from Amphion.models.codec.ns3_codec import FACodecEncoderV2, FACodecDecoderV2

# Same parameters as FACodecEncoder/FACodecDecoder
fa_encoder_v2 = FACodecEncoderV2(...)
fa_decoder_v2 = FACodecDecoderV2(...)

# Download the V2 encoder checkpoint from the Hugging Face Hub
encoder_v2_ckpt = hf_hub_download(repo_id="amphion/naturalspeech3_facodec", filename="ns3_facodec_encoder_v2.bin")
...
```
Hi, since our model is trained on 16 kHz English data, voice conversion (VC) performance in other languages may not be as good as shown on the demo page.