
How to train SEA model

Open cyxomo opened this issue 4 years ago • 14 comments

The pretrained model sea.ckpt only fits a dataset with 82 speakers. However, I have a huge dataset with at least 300 speakers. How could I train a corresponding SAE model?

cyxomo avatar Aug 10 '21 03:08 cyxomo

Do you mean SEA?

You can refer to the SEA paper for training details.

auspicious3000 avatar Aug 10 '21 03:08 auspicious3000

I seem to have made a mistake. Actually, when preparing the data, only the encoder part of the SEA model is used. But I'm not sure whether changing the speakers will make a difference.

cyxomo avatar Aug 10 '21 03:08 cyxomo

Does it matter if I take my own data and extract features with the 82-speaker SEA model that you pretrained?

cyxomo avatar Aug 10 '21 03:08 cyxomo

> Do you mean SEA?
>
> You can refer to the SEA paper for training details.

Yeah, sorry for the spelling mistake.

cyxomo avatar Aug 10 '21 03:08 cyxomo

The performance might degrade, but feel free to try.

auspicious3000 avatar Aug 10 '21 03:08 auspicious3000

> The performance might degrade, but feel free to try.

So the right thing to do is to train an SEA model on my own data and then extract the features. Could the SEA training code be provided?

cyxomo avatar Aug 10 '21 03:08 cyxomo

The majority of the code for SEA is here. You just need a data loader and an optimizer.

auspicious3000 avatar Aug 10 '21 03:08 auspicious3000
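To make that concrete, below is a minimal training-loop sketch around the repository's model_sea.Generator. Everything outside the model itself is an assumption: the constructor argument, the forward signature (inferred from the training snippet later in this thread), the feature dimensions, and the synthetic stand-in data, which is only a hypothetical placeholder for a real data loader.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

from model_sea import Generator  # SEA model from this repository
from hparams import hparams

# --- hypothetical stand-in data (replace with a real data loader) ---
# Shapes are assumptions: 20-dim cut MFCCs as input (cep_real in this
# thread), full MFCCs as the reconstruction target (cep_real0), one-hot
# speaker vectors, and a float mask over valid frames.
n_utt, n_frames, n_spk = 64, 128, hparams.dim_spk
cep_real = torch.randn(n_utt, n_frames, 20)
cep_real0 = torch.randn(n_utt, n_frames, 80)
spk_emb = F.one_hot(torch.randint(n_spk, (n_utt,)), n_spk).float()
mask = torch.ones(n_utt, n_frames)
loader = DataLoader(TensorDataset(cep_real, cep_real0, spk_emb, mask),
                    batch_size=16, shuffle=True)

model = Generator(hparams)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(100):  # number of epochs chosen arbitrarily
    for x, x0, spk, m in loader:
        # Two reconstructions: one from the encoder code and one from
        # the self-expressed code (call signature inferred from the
        # training snippet later in this thread).
        out_A, out_B = model(x, spk, m)
        loss = F.mse_loss(out_A, x0) + F.mse_loss(out_B, x0)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```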

> The majority of the code for SEA is here. You just need a data loader and an optimizer.

OK, do you use a loss function like the one below?
[image: screenshot of the loss function]

cyxomo avatar Aug 10 '21 03:08 cyxomo

Yes

auspicious3000 avatar Aug 10 '21 04:08 auspicious3000
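For reference, the loss being confirmed here appears, judging from the training snippet later in this thread (loss_A + loss_B over two decoder outputs), to be a two-term reconstruction objective. The following rendering is an inference, not a transcription of the original image:

```latex
\mathcal{L} = \lVert \hat{X}_A - X \rVert_2^2 + \lVert \hat{X}_B - X \rVert_2^2
```

where $X$ is the target feature sequence, $\hat{X}_A$ is the decoder output from the encoder code, and $\hat{X}_B$ is the decoder output from the self-expressed code $Z$.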

@auspicious3000 What is c_trg in model_sea.Generator.forward? It is part of the decoder's LSTM input, and its dimension is the same as hparams.dim_spk, which is 82, but I still have no idea how to get it...

vasyarv avatar Sep 05 '21 14:09 vasyarv

It is the one-hot speaker embedding.

auspicious3000 avatar Sep 05 '21 15:09 auspicious3000
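For illustration, such a one-hot c_trg can be built as follows; the speaker index here is a hypothetical example, and dim_spk = 82 matches the pretrained checkpoint:

```python
import torch
import torch.nn.functional as F

dim_spk = 82                 # hparams.dim_spk for the pretrained model
spk_idx = torch.tensor([5])  # hypothetical: index of the target speaker

# (batch, dim_spk) one-hot vector passed as c_trg to Generator.forward
c_trg = F.one_hot(spk_idx, num_classes=dim_spk).float()
```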

> Do you mean SEA?
>
> You can refer to the SEA paper for training details.

Hi! Could you point me to the SEA paper? I want to make sure I am reading the right one.

stalevna avatar Nov 08 '21 06:11 stalevna

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery

auspicious3000 avatar Nov 08 '21 06:11 auspicious3000

@auspicious3000 Could you check my code for the SEA training loss below:

```python
mask_sp_real = ~sequence_mask(len_real, cep_real0.size(1))  # cep_real0 is the full MFCC, not cut to [:, 0:20]
mask = (~mask_sp_real).float()
self.P.train()
# mel_outputs_B is the decoder output fed with the self-expressed autoencoded Z
mel_outputs, mel_outputs_B = self.P(cep_real, spk_emb, mask)
loss_A = F.mse_loss(mel_outputs, cep_real0, reduction='mean')
loss_B = F.mse_loss(mel_outputs_B, cep_real0, reduction='mean')
p_loss = loss_A + loss_B
```

wang1612 avatar Nov 16 '21 06:11 wang1612
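The sequence_mask helper in the snippet above is not shown in this thread. A typical implementation (an assumption; the repository may define it differently) returns True at valid frames and False at padding, so that ~sequence_mask(...) marks the padded positions and mask = (~mask_sp_real).float() recovers a float mask over valid frames:

```python
import torch

def sequence_mask(lengths, max_len=None):
    # True at valid time steps, False at padded ones.
    # lengths: (batch,) tensor of per-utterance frame counts.
    if max_len is None:
        max_len = int(lengths.max())
    steps = torch.arange(max_len, device=lengths.device)
    return steps.unsqueeze(0) < lengths.unsqueeze(1)
```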