SpeechSplit
Training Pitch F0 converter model P
Hi, thanks for the complete code! I wanted to check how I can train the F0 converter P. train.py only trains the SpeechSplit model G.
Kindly help.
The training code for P is almost identical to solver.py. You can get it by modifying solver.py based on the description in the paper.
train.py or main.py? I only see main.py
solver.py
I still have no idea how to train the F0 converter... Do I only need to change code in solver.py?
@shubham-IISc @hongyuntw any idea how to solve this problem? Any help is appreciated.
https://github.com/auspicious3000/SpeechSplit/issues/28
@FurkanGozukara I just changed the data loader in solver.py.
I think we need to load two batches at the same time, so add `x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)` after lines 142 and 145.
Now we have two batches in one iteration. I just used the preprocessing from demo.ipynb to get matching shapes, with a call like `f0_pred = P(uttr_org_pad, f0_trg_onehot)[0]`,
and used the same loss as G.
It works, but I don't know whether it is correct or not.
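The two-batch idea described above can be sketched with a dummy loader. The tensor shapes and the `(x_real, emb, f0, len)` tuple layout are assumptions modeled on solver.py, not the repo's actual dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset mirroring the loader tuples in solver.py:
# (mel-spectrogram, speaker embedding, F0 contour, utterance length).
n, T = 8, 128
dataset = TensorDataset(
    torch.randn(n, T, 80),                  # x_real: mel-spectrograms
    torch.randn(n, 82),                     # emb: speaker embeddings
    torch.rand(n, T, 1),                    # f0: normalized pitch contours
    torch.full((n,), T, dtype=torch.long),  # len: utterance lengths
)
data_loader = DataLoader(dataset, batch_size=2, shuffle=True)

data_iter = iter(data_loader)
# draw two independent batches in one iteration: one "source", one "target"
x_real_org, emb_org, f0_org, len_org = next(data_iter)
x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)
```

Because the two calls to `next` advance the same iterator, source and target come from different utterances of the same loader.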
@hongyuntw thank you for the answer, but I didn't understand what you mean.
Could you show the whole code?
@FurkanGozukara For some reason I can't copy my whole code for you. All the changes are in the `Solver` class in solver.py:
- In the `build_model` function, create the F0 converter and its optimizer:

      self.P = F0_Converter(self.hparams)
      self.p_optimizer = torch.optim.Adam(self.P.parameters(), self.p_lr,
                                          [self.beta1, self.beta2],
                                          weight_decay=0.000001)  # or whatever you want
- In the `reset_grad` function, also zero the new optimizer:

      self.p_optimizer.zero_grad()
- In the training loop (inside the `train` function), draw two batches per iteration:

      try:
          x_real_org, emb_org, f0_org, len_org = next(data_iter)
          x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)
      except StopIteration:
          data_iter = iter(data_loader)
          x_real_org, emb_org, f0_org, len_org = next(data_iter)
          x_real_trg, emb_trg, f0_trg, len_trg = next(data_iter)
- Preprocess `f0_trg` based on demo.ipynb (inside the `train` function). Here's my code:

      f0_list = []
      for i in range(f0_trg.shape[0]):
          log_f0 = f0_trg[i].cpu().numpy()
          flatten_log_f0 = log_f0.flatten()
          f0_trg_quantized = quantize_f0_numpy(flatten_log_f0)[0]
          f0_trg_onehot = torch.from_numpy(f0_trg_quantized).to(self.device)
          f0_list.append(f0_trg_onehot)
      f0_trg = torch.stack(f0_list)
- P forward pass (inside the `train` function):

      f0_pred = self.P(x_real_org, f0_trg)
      p_loss_id = F.mse_loss(f0_pred, f0_trg, reduction='mean')
- Add the backward pass and optimizer step (inside the `train` function):

      self.reset_grad()
      p_loss_id.backward()
      self.p_optimizer.step()
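The steps above can be put together into one self-contained training step. `F0_Converter` and `quantize_f0_numpy` below are tiny stand-ins with assumed shapes (the real ones live in model.py and utils.py of the SpeechSplit repo), so this is a sketch of the flow, not the actual model:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for utils.quantize_f0_numpy: bins a normalized F0 contour into
# 256 one-hot classes plus one "unvoiced" class (257 total), matching the
# shapes used in demo.ipynb. This is an assumption, not the original code.
def quantize_f0_numpy(x, num_bins=256):
    x = x.copy()
    uv = x <= 0                                   # unvoiced frames
    x[uv] = 0.0
    enc = np.zeros((len(x), num_bins + 1), dtype=np.float32)
    idx = np.rint(x * (num_bins - 1)).astype(np.int64) + 1
    idx[uv] = 0
    enc[np.arange(len(x)), idx] = 1.0
    return enc, idx

# Hypothetical tiny converter standing in for model.F0_Converter.
class F0_Converter(nn.Module):
    def __init__(self, dim_spec=80, dim_f0=257):
        super().__init__()
        self.net = nn.Linear(dim_spec + dim_f0, dim_f0)

    def forward(self, x, f0_onehot):
        return self.net(torch.cat([x, f0_onehot], dim=-1))

# One P training step, following the bullet list above.
B, T = 2, 64
P = F0_Converter()
p_optimizer = torch.optim.Adam(P.parameters(), 1e-4, [0.9, 0.999],
                               weight_decay=1e-6)

x_real_org = torch.randn(B, T, 80)   # source spectrograms
f0_trg = torch.rand(B, T, 1)         # target F0 contours

# preprocess f0_trg: quantize each contour to one-hot, then re-stack the batch
f0_list = []
for i in range(B):
    flat = f0_trg[i].numpy().flatten()
    onehot = torch.from_numpy(quantize_f0_numpy(flat)[0])
    f0_list.append(onehot)
f0_trg_onehot = torch.stack(f0_list)  # (B, T, 257)

# forward, loss, backward, step
p_optimizer.zero_grad()
f0_pred = P(x_real_org, f0_trg_onehot)
p_loss_id = F.mse_loss(f0_pred, f0_trg_onehot, reduction='mean')
p_loss_id.backward()
p_optimizer.step()
```

The loss is the same plain MSE reconstruction loss used for G, computed between the predicted and quantized target F0.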
@hongyuntw thank you for the answer.
Could you edit your answer and write, above each snippet, the class or method name where it should go?
Also, could you change `self.p_optimizer = xxxx` (whatever you want) into a full example?
@FurkanGozukara
ok done, hope this helps
@hongyuntw thank you very much for the answer.
I am getting the error `name 'F0_Converter' is not defined`.
Also, what does "add backward code (inside train function)" mean?
I have uploaded everything to a repository: https://github.com/FurkanGozukara/SpeechSplitTest
