Pretrain Hubert base second iteration

Open renadnasser1 opened this issue 2 years ago • 3 comments

I'm training a HuBERT model from scratch on 8 kHz speech audio, as described in the paper. The first iteration succeeded. I've started the second iteration, where the first-iteration features were used to learn the k-means clusters. Why is the following warning printed for all of the training data? Should I be concerned?

[2023-05-02 16:53:22,434][fairseq.data.audio.hubert_dataset][WARNING] - audio and label duration differ too much

May 03 '23 10:05 renadnasser1

I don't know if this is still relevant, but I think you need to set model.label_rate=50 for the second iteration.

Aug 01 '23 07:08 MorenoLaQuatra
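
For anyone hitting the same warning: it seems to come from comparing the clip duration implied by the audio (samples / sample rate) with the duration implied by the labels (label count / label rate). Below is a minimal sketch of that kind of check, not fairseq's actual code. The function name, the 0.1 s tolerance, and the clip sizes are hypothetical values chosen for illustration.

```python
def duration_mismatch(num_samples, sample_rate, num_labels, label_rate, tol=0.1):
    """Return True if audio and label durations differ by more than tol seconds.

    Hypothetical helper for illustration; tol=0.1 is an assumed tolerance.
    """
    audio_sec = num_samples / sample_rate   # duration implied by the waveform
    label_sec = num_labels / label_rate     # duration implied by the label sequence
    return abs(audio_sec - label_sec) > tol


# Illustrative clip: 10 s of audio with 500 cluster labels (50 labels per second).
num_samples, sample_rate, num_labels = 160_000, 16_000, 500

for rate in (100, 50):
    flagged = duration_mismatch(num_samples, sample_rate, num_labels, rate)
    print(f"label_rate={rate}: mismatch={flagged}")
```

With these numbers, a configured label rate of 100 makes the labels look like 5 s against 10 s of audio, so every clip would be flagged, while a rate of 50 matches. If the warning appears for all files, checking that the configured label rate equals the real number of labels per second of audio is a reasonable first step.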

How do you set up training on a single machine with multiple GPUs?

Oct 16 '24 07:10 GUOhm230

I've started the second iteration, where the first-iteration features were used to learn the k-means clusters. Why is the following warning printed for all of the training data? Should I be concerned?

Do you need to initialize the second-iteration training with the model parameters from the first iteration?

Nov 12 '24 07:11 GUOhm230