SpeechSplit
Training Pitch F0 converter model P
Hi, thanks for the complete code! I wanted to check how I can train the F0 converter P. train.py only trains the SpeechSplit model G.
Kindly help.
The training code for P is almost identical to solver.py. You can get it by modifying solver.py based on the description in the paper.
train.py or main.py? I only see main.py
solver.py
I still have no idea how to train the F0 converter... Do I only need to change code in solver.py?
@shubham-IISc @hongyuntw any idea how to solve this problem? Any help is appreciated.
https://github.com/auspicious3000/SpeechSplit/issues/28
@FurkanGozukara I just changed the data loader in solver.py.
I think we need to load two batches at the same time, so add `x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)` after lines 142 and 145.
Now we have two batches in one iteration. I just used the preprocessing from demo.ipynb to get matching shapes, with a call like `f0_pred = P(uttr_org_pad, f0_trg_onehot)[0]`,
and used the same loss as G.
It works, but I don't know whether it is correct or not.
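The two-batch idea described above can be sketched with a dummy loader. The tensor shapes and the `(x_real, emb, f0, len)` tuple layout are assumptions modeled on solver.py, not the repo's actual dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset mirroring the loader tuples in solver.py:
# (mel-spectrogram, speaker embedding, F0 contour, utterance length).
n, T = 8, 128
dataset = TensorDataset(
    torch.randn(n, T, 80),                  # x_real: mel-spectrograms
    torch.randn(n, 82),                     # emb: speaker embeddings
    torch.rand(n, T, 1),                    # f0: normalized pitch contours
    torch.full((n,), T, dtype=torch.long),  # len: utterance lengths
)
data_loader = DataLoader(dataset, batch_size=2, shuffle=True)

data_iter = iter(data_loader)
# draw two independent batches in one iteration: one "source", one "target"
x_real_org, emb_org, f0_org, len_org = next(data_iter)
x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)
```

Because the two calls to `next` advance the same iterator, source and target come from different utterances of the same loader.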
@hongyuntw thank you for the answer, but I didn't understand what you mean.
Could you show the whole code?
@FurkanGozukara For some reason I can't copy my whole code for you. All the changes are in the `Solver` class in solver.py:
- In the `build_model` function, create the F0 converter and its optimizer:

      self.P = F0_Converter(self.hparams)
      self.p_optimizer = torch.optim.Adam(self.P.parameters(), self.p_lr,
                                          [self.beta1, self.beta2],
                                          weight_decay=0.000001)  # or whatever you want
- In the `reset_grad` function, also zero the new optimizer:

      self.p_optimizer.zero_grad()
- In the training loop (inside the `train` function), draw two batches per iteration:

      try:
          x_real_org, emb_org, f0_org, len_org = next(data_iter)
          x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)
      except StopIteration:
          data_iter = iter(data_loader)
          x_real_org, emb_org, f0_org, len_org = next(data_iter)
          x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)
- Preprocess `f0_trg` based on demo.ipynb (inside the `train` function). Here's my code:

      f0_list = []
      for i in range(f0_trg.shape[0]):
          log_f0 = f0_trg[i].cpu().numpy()
          flatten_log_f0 = log_f0.flatten()
          f0_trg_quantized = quantize_f0_numpy(flatten_log_f0)[0]
          f0_trg_onehot = torch.from_numpy(f0_trg_quantized).to(self.device)
          f0_list.append(f0_trg_onehot)
      f0_trg = torch.stack(f0_list)
- P forward pass (inside the `train` function):

      f0_pred = self.P(x_real_org, f0_trg)
      p_loss_id = F.mse_loss(f0_pred, f0_trg, reduction='mean')
- Add the backward pass and optimizer step (inside the `train` function):

      self.reset_grad()
      p_loss_id.backward()
      self.p_optimizer.step()
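The steps above can be put together into one self-contained training step. `F0_Converter` and `quantize_f0_numpy` below are tiny stand-ins with assumed shapes (the real ones live in model.py and utils.py of the SpeechSplit repo), so this is a sketch of the flow, not the actual model:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for utils.quantize_f0_numpy: bins a normalized F0 contour into
# 256 one-hot classes plus one "unvoiced" class (257 total), matching the
# shapes used in demo.ipynb. This is an assumption, not the original code.
def quantize_f0_numpy(x, num_bins=256):
    x = x.copy()
    uv = x <= 0                                   # unvoiced frames
    x[uv] = 0.0
    enc = np.zeros((len(x), num_bins + 1), dtype=np.float32)
    idx = np.rint(x * (num_bins - 1)).astype(np.int64) + 1
    idx[uv] = 0
    enc[np.arange(len(x)), idx] = 1.0
    return enc, idx

# Hypothetical tiny converter standing in for model.F0_Converter.
class F0_Converter(nn.Module):
    def __init__(self, dim_spec=80, dim_f0=257):
        super().__init__()
        self.net = nn.Linear(dim_spec + dim_f0, dim_f0)

    def forward(self, x, f0_onehot):
        return self.net(torch.cat([x, f0_onehot], dim=-1))

# One P training step, following the bullet list above.
B, T = 2, 64
P = F0_Converter()
p_optimizer = torch.optim.Adam(P.parameters(), 1e-4, [0.9, 0.999],
                               weight_decay=1e-6)

x_real_org = torch.randn(B, T, 80)   # source spectrograms
f0_trg = torch.rand(B, T, 1)         # target F0 contours

# preprocess f0_trg: quantize each contour to one-hot, then re-stack the batch
f0_list = []
for i in range(B):
    flat = f0_trg[i].numpy().flatten()
    onehot = torch.from_numpy(quantize_f0_numpy(flat)[0])
    f0_list.append(onehot)
f0_trg_onehot = torch.stack(f0_list)  # (B, T, 257)

# forward, loss, backward, step
p_optimizer.zero_grad()
f0_pred = P(x_real_org, f0_trg_onehot)
p_loss_id = F.mse_loss(f0_pred, f0_trg_onehot, reduction='mean')
p_loss_id.backward()
p_optimizer.step()
```

The loss is the same plain MSE reconstruction loss used for G, computed between the predicted and quantized target F0.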
@hongyuntw thank you for the answer.
Could you edit your answer and write, above each snippet, the class or method name where it should go?
Also, could you change `self.p_optimizer = xxxx` (whatever you want) into a full example?
@FurkanGozukara
ok done, hope this helps
@hongyuntw thank you very much for the answer.
I am getting the error `name 'F0_Converter' is not defined`.
Also, what does "add backward code (inside train function)" mean?
I have uploaded everything to a repository: https://github.com/FurkanGozukara/SpeechSplitTest
