fairseq

Pretrain Hubert base second iteration

Open renadnasser1 opened this issue 2 years ago • 3 comments

I'm training a HuBERT model from scratch on 8 kHz speech audio, following the procedure described in the paper. The first iteration succeeded. I've now started the second iteration, in which features from the first-iteration model are used to learn the k-means clusters. Why is the following warning printed for all of the training data? Should I be concerned?

[2023-05-02 16:53:22,434][fairseq.data.audio.hubert_dataset][WARNING] - audio and label duration differ too much

renadnasser1 • May 03 '23 10:05

I don't know if this is still relevant, but I think you need to set model.label_rate=50 for the second iteration.
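
To illustrate why the label rate matters here, the following is a minimal sketch of the kind of check that could trigger such a warning: it compares the duration implied by the audio with the duration implied by the number of labels. It is not the fairseq implementation. The function names, the 0.1 s tolerance, and the sample counts are assumptions for illustration, and the total stride of 320 samples assumes the default HuBERT base convolutional feature extractor.

```python
def frame_rate(sample_rate, total_stride=320):
    """Frames per second produced by a feature extractor.

    total_stride=320 is an assumption: the product of the conv strides
    in the default HuBERT base feature extractor (5 * 2**6).
    """
    return sample_rate / total_stride


def duration_mismatch(num_samples, sample_rate, num_labels, label_rate, tol=0.1):
    """Return (is_mismatch, audio_seconds, label_seconds).

    Hypothetical helper: flags an utterance when the duration implied by
    the labels differs from the audio duration by more than `tol` seconds.
    """
    audio_seconds = num_samples / sample_rate
    label_seconds = num_labels / label_rate
    return abs(audio_seconds - label_seconds) > tol, audio_seconds, label_seconds


# Illustrative numbers: 10 s of 16 kHz audio with 500 labels.
print(frame_rate(16000))                            # frames per second at 16 kHz
print(duration_mismatch(160000, 16000, 500, 50))    # label_rate matches the labels
print(duration_mismatch(160000, 16000, 500, 100))   # label_rate left at a first-iteration value
```

Under the same stride assumption, 8 kHz input would give 25 frames per second rather than 50, so with 8 kHz data it may be worth confirming which rate the second-iteration labels were actually produced at.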

MorenoLaQuatra • Aug 01 '23 07:08

How do you set up training on a single machine with multiple GPUs?

GUOhm230 • Oct 16 '24 07:10

"I've started the second iteration where first iteration features were used to learns kmeans clusters. why the follow warning printed for all the training data. should I be concerned ?"

Do you need to use the model parameters from the first iteration when training the second iteration?

GUOhm230 • Nov 12 '24 07:11