Finetuning model with customized dataset

Open lkim0402 opened this issue 1 year ago • 1 comments

Hello! I want to use MeloTTS to generate speech in a voice from my own custom dataset. I'm fairly new to this, and I'm really confused about how to take the checkpoints or pretrained models and fine-tune them on my own data.

Right now I just have this script, which I run as model.py.

import torch
from melo.api import TTS

# Speed is adjustable
speed = 1.0
device = 'cpu' # or cuda:0

text = "안녕하세요! 오늘은 날씨가 정말 좋네요."
checkpoint_path = '/home/tts/MeloTTS/melo/configs/checkpoint.pth'
config_path = '/home/tts/MeloTTS/melo/data/kor_config.json'

model = TTS(language='KR', device=device, config_path=config_path, ckpt_path=checkpoint_path)
speaker_ids = model.hps.data.spk2id

For kor_config.json, I am using the Korean config file, and in it I have set the training and validation file lists to:

"training_files": "/home/tts/MeloTTS/melo/data/train.list",
"validation_files": "/home/tts/MeloTTS/melo/data/val.list",
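Before training, it can help to confirm that both list files are well-formed. The sketch below is only a sanity check, not part of MeloTTS. It assumes each non-empty line is pipe-separated, starts with the path to a wav file, and has at least four fields (audio path, speaker, language, text); that layout is an assumption to verify against the MeloTTS training docs. The function name check_list_file is hypothetical.

```python
from pathlib import Path


def check_list_file(path, expected_fields=4):
    """Return (line_number, problem) pairs for a pipe-separated list file.

    Assumed layout (not confirmed by the thread): the first field is the
    path to an audio file, and each line has at least `expected_fields` fields.
    """
    problems = []
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("|")
        if len(fields) < expected_fields:
            problems.append(
                (number, f"expected at least {expected_fields} fields, got {len(fields)}")
            )
            continue
        wav_path = Path(fields[0].strip())
        if not wav_path.is_file():
            problems.append((number, f"audio file not found: {wav_path}"))
    return problems


if __name__ == "__main__":
    # Paths taken from the config entries above.
    for list_path in (
        "/home/tts/MeloTTS/melo/data/train.list",
        "/home/tts/MeloTTS/melo/data/val.list",
    ):
        if not Path(list_path).is_file():
            print(f"{list_path}: list file not found")
            continue
        for number, problem in check_list_file(list_path):
            print(f"{list_path}:{number}: {problem}")
```

If the script prints nothing, every line has the assumed number of fields and points at an audio file that exists on disk.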

The checkpoint is the Korean checkpoint that the Korean entry in download_utils.py points to.

How can I now fine-tune the model on my own data?

Aug 08 '24 08:08 lkim0402

Have you figured out the fine-tuning pipeline? I need some help! Thanks.

May 24 '25 18:05 rushichavda