Yuancheng0625 comments

Results: 35 comments of Yuancheng0625

Unable to run training script of Natural Speech 2

Data processing for NS2 differs somewhat from that of the other TTS models. We will update the data processing section as soon as possible.

Unable to run training script of Natural Speech 2

Hi, we will release a new checkpoint and a data processing pipeline for a large dataset (> 10,000 hours) in about two weeks. For now, we only use LibriTTS to train...

[Feature]: Audiocap dataset dev and test files

Hi, you can split off part of the training files to use as test and dev files.
valid json: https://drive.google.com/file/d/18wXIJjO8RgLnaj5e3hcYkyOHGtBRtv8y/view?usp=drive_link
train json: https://drive.google.com/file/d/1bquMJRyQ9F1In0w_seLEma0GzJFqfzNj/view?usp=drive_link
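One way to carve dev and test sets out of the training metadata is a seeded shuffle followed by slicing. This is only a minimal sketch: it assumes the train json loads as a top-level list of entries (the actual file layout is not confirmed here), and the function name `split_entries` and the default fractions are hypothetical.

```python
import random


def split_entries(entries, dev_fraction=0.05, test_fraction=0.05, seed=0):
    """Shuffle entries reproducibly and split them into (train, dev, test)."""
    if dev_fraction < 0 or test_fraction < 0 or dev_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    # Copy first so the caller's list is left untouched.
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)

    n_dev = int(len(shuffled) * dev_fraction)
    n_test = int(len(shuffled) * test_fraction)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test
```

If the file really is a list, `json.load` on the downloaded train json would give the `entries` argument, and each returned list could be written back out with `json.dump`. Fixing the seed keeps the split stable across runs.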

[Help]: Is there any loss that linearly correlate to performance of TTA autoencoder?

Hi, the details can be found in https://github.com/open-mmlab/Amphion/blob/main/egs/tta/autoencoderkl/exp_config_latent_4_10_78.json

[Help]: Question of Data Preparation for TTA

Hi,
valid json: https://drive.google.com/file/d/18wXIJjO8RgLnaj5e3hcYkyOHGtBRtv8y/view?usp=drive_link
train json: https://drive.google.com/file/d/1bquMJRyQ9F1In0w_seLEma0GzJFqfzNj/view?usp=drive_link

[Help]: Questions about FACodec's Parameter

These parameters are used to predict the speaker ID during training and are not used during inference. Please ignore them.

[Help]: The difference between the FAcodec pretrained model "FACodecEncoderV2" vs "FACodecEncoder"

Hi, there is no difference between FACodecEncoder and FACodecEncoderV2. The difference between FACodecDecoder and FACodecDecoderV2 is that the prosody part of FACodecDecoderV2 uses a pitch-shifted waveform to achieve better disentanglement...

[BUG]: the lengths of the features after FACodecEncoderV2 is not match

Hi, you need to pad your wav length to a multiple of 200 (the hop size).
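The padding amount is simple arithmetic: append as many zeros as it takes to reach the next multiple of the hop size. The sketch below works on a plain Python list to stay dependency-free; the helper names `pad_length` and `pad_to_multiple` are hypothetical, not part of Amphion.

```python
def pad_length(num_samples, hop_size=200):
    """Return how many samples to append so the length is a multiple of hop_size."""
    if hop_size <= 0:
        raise ValueError("hop_size must be positive")
    # (-n) % h is 0 when n is already a multiple of h, else the gap to the next one.
    return (-num_samples) % hop_size


def pad_to_multiple(samples, hop_size=200):
    """Right-pad a sequence of samples with zeros to a multiple of hop_size."""
    return list(samples) + [0.0] * pad_length(len(samples), hop_size)
```

For a PyTorch tensor `wav` whose last dimension is time, the same amount could be applied with `torch.nn.functional.pad(wav, (0, pad_length(wav.shape[-1])))`, which pads zeros on the right of the last dimension.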

[Help]: FACodec. How to recreate demo examples for voice conversion?

Hi, which checkpoint are you using? You can follow:

```python
from huggingface_hub import hf_hub_download

from Amphion.models.codec.ns3_codec import FACodecEncoderV2, FACodecDecoderV2

# Same parameters as FACodecEncoder/FACodecDecoder
fa_encoder_v2 = FACodecEncoderV2(...)
fa_decoder_v2 = FACodecDecoderV2(...)

# Download the V2 encoder checkpoint from the Hugging Face Hub
encoder_v2_ckpt = hf_hub_download(repo_id="amphion/naturalspeech3_facodec", filename="ns3_facodec_encoder_v2.bin")
...
```

[Help]: FACodec. How to recreate demo examples for voice conversion?

Hi, since our model is trained on 16 kHz English data, voice conversion (VC) performance in other languages may not be as good as shown on the demo page.