Yuancheng0625
Hi, you can download the vocoder checkpoint from https://huggingface.co/amphion/text_to_audio/tree/main/tta/hifigan_checkpoints
If the model is saved as model.safetensors (i.e., not pytorch_model.bin), please run `pip install accelerate==0.24.1`.
Hi, we opened a PR to fix this problem. You can check it! (We use `from diffusers.optimization import get_cosine_schedule_with_warmup`.)
Hi, we haven't tested NoamScheduler. I think using AdamW with a learning rate between 5e-5 and 1e-4, together with a cosine schedule with between 5K and 10K warmup steps, will give a more...
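For reference, the warmup-then-cosine behavior suggested above can be sketched in plain Python. The step counts and base learning rate below are illustrative values within the suggested ranges, not the project's official config; in practice you would pass an AdamW optimizer to diffusers' `get_cosine_schedule_with_warmup` instead of computing this by hand:

```python
import math

def cosine_schedule_with_warmup(step, warmup_steps, total_steps, base_lr):
    """Cosine learning-rate schedule with linear warmup.

    Mirrors the shape of diffusers' get_cosine_schedule_with_warmup:
    lr ramps linearly from 0 to base_lr over warmup_steps, then decays
    to 0 along half a cosine over the remaining steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Illustrative values: base lr 1e-4, 5K warmup, 100K total steps.
lrs = [cosine_schedule_with_warmup(s, 5_000, 100_000, 1e-4)
       for s in range(0, 100_001, 5_000)]
```

The learning rate peaks at `base_lr` exactly when warmup ends and reaches zero at the final step.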
Sure, we will provide the script soon, and we will also provide the processed data for AudioCaps (or more) so you can download it directly.
> Hi, can you give us some direction of how to use pre-trained models in the TTA or TTM recipe. How do we include these pre-trained make-an-audio (https://drive.google.com/drive/folders/1zZTI3-nHrUIywKFqwxlFO6PjB66JA8jI) or this...
We will release the AudioCaps dataset on Hugging Face within one week!
> Thank you for your great work! Could you please provide more information about the data format? I am trying to encode a 10s 24000fps wav file into embedding space...
We have released the processed AudioCaps dataset: https://openxlab.org.cn/datasets/Amphion/AudioCaps
Hi, since we use the official HiFi-GAN repo to train the vocoder for TTA, you can refer to https://github.com/jik876/hifi-gan for converting a waveform to a mel-spectrogram.
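As a rough illustration of what that conversion involves, here is a NumPy-only sketch of a log-mel-spectrogram: Hann-windowed magnitude STFT, a triangular mel filterbank, then a clamped log. All parameters (n_fft=1024, hop=256, 80 mels, HTK mel scale) are assumptions for illustration; HiFi-GAN's own `meldataset.py` uses librosa's Slaney-style filterbank and torch's STFT with its config's exact values, so use that code for training-compatible features:

```python
import numpy as np

def hz_to_mel(f):
    # HTK mel scale; hifi-gan uses librosa's Slaney-style filterbank,
    # so this is a simplified approximation, not a drop-in replacement.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels, fmin=0.0, fmax=None):
    # Triangular filters spaced evenly on the mel scale.
    fmax = fmax or sr / 2
    mel_pts = np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(1, center - left)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(1, right - center)
    return fb

def mel_spectrogram(wav, sr=24000, n_fft=1024, hop=256, n_mels=80):
    # Frame the signal with a Hann window and take the magnitude STFT.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wav) - n_fft) // hop
    frames = np.stack([wav[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))   # (frames, n_fft//2+1)
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T      # (frames, n_mels)
    return np.log(np.clip(mel, 1e-5, None))              # clamped log-mel

# e.g. one second of a 440 Hz sine at 24 kHz -> (frames, 80) log-mel matrix
wav = np.sin(2 * np.pi * 440 * np.arange(24000) / 24000).astype(np.float32)
M = mel_spectrogram(wav)
```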