Finetune hubert

Open marcoyang1998 opened this issue 3 years ago • 10 comments
This recipe implements a HuBERT transducer model. It supports finetuning a pretrained HuBERT model with a custom vocabulary using the pruned RNN-T loss. To use this recipe, you need fairseq as a dependency. The finetuning setup (learning rate, optimizer, scheduler, ...) is the same as in the original HuBERT paper.
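
For readers unfamiliar with the scheduler side of that setup, here is a minimal sketch of a tri-stage learning-rate schedule (linear warmup, hold, exponential decay), which is the style of schedule used in fairseq finetuning configs. The function name `tri_stage_lr` and the default values for `phase_ratio`, `init_scale`, and `final_scale` are assumptions for illustration, not values taken from this recipe.

```python
import math


def tri_stage_lr(step, total_steps, peak_lr,
                 phase_ratio=(0.1, 0.4, 0.5),  # assumed warmup/hold/decay split
                 init_scale=0.01,              # assumed starting LR = init_scale * peak_lr
                 final_scale=0.05):            # assumed final LR = final_scale * peak_lr
    """Return the learning rate at `step` for a tri-stage schedule."""
    warmup = int(total_steps * phase_ratio[0])
    hold = int(total_steps * phase_ratio[1])
    decay = int(total_steps * phase_ratio[2])

    if step < warmup:
        # Stage 1: linear warmup from init_scale * peak_lr to peak_lr.
        init_lr = init_scale * peak_lr
        return init_lr + (peak_lr - init_lr) * step / warmup
    if step < warmup + hold:
        # Stage 2: hold the peak learning rate.
        return peak_lr
    if step < warmup + hold + decay:
        # Stage 3: exponential decay from peak_lr towards final_scale * peak_lr.
        t = step - warmup - hold
        return peak_lr * math.exp(math.log(final_scale) * t / decay)
    # After the schedule ends, stay at the final learning rate.
    return final_scale * peak_lr
```

Calling `tri_stage_lr(step, total_steps, peak_lr)` once per optimizer update gives the learning rate to set for that update; the actual recipe may differ in its ratios and scales.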

marcoyang1998 avatar Sep 27 '22 07:09 marcoyang1998

Here are some finetuning results on 960h:

| model name | test-clean | test-other |
|--------------|------------|------------|
| Hubert base | 2.82 | 7.09 |
| Hubert large | 1.93 | 3.93 |

Models are trained using BPE500; WERs (%) are obtained using modified beam search.
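
For reference, WER is the word-level edit distance between hypothesis and reference, divided by the number of reference words. The sketch below shows that computation in plain Python; the helper names `edit_distance` and `wer` are hypothetical and this is not the scoring code used by the recipe.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two word lists (sub/ins/del all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def wer(pairs):
    """Corpus-level WER in percent over (reference, hypothesis) string pairs."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    if words == 0:
        raise ValueError("reference transcripts contain no words")
    return 100.0 * errors / words
```

Errors and reference words are summed over the whole test set before dividing, so long utterances weigh more than short ones.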

marcoyang1998 avatar Sep 27 '22 07:09 marcoyang1998

Could you also update README.md and RESULTS.md?

csukuangfj avatar Sep 27 '22 07:09 csukuangfj

> Could you also update README.md and RESULTS.md?

Just did it.

marcoyang1998 avatar Sep 27 '22 08:09 marcoyang1998

Perhaps some of the files in finetune_hubert_transducer can be removed or converted to soft links (such as asr_datamodule.py)?

desh2608 avatar Sep 27 '22 15:09 desh2608

@marcoyang1998 Looks pretty nice! @marcoyang1998 and @csukuangfj, tell me if you need a hand closing this PR.

ezerhouni avatar Oct 03 '22 12:10 ezerhouni

> @marcoyang1998 Looks pretty nice! @marcoyang1998 and @csukuangfj, tell me if you need a hand closing this PR.

@ezerhouni Thanks! Let's see what @marcoyang1998 would comment on this.

csukuangfj avatar Oct 03 '22 14:10 csukuangfj

Sorry, I may be slow to respond during the holiday. Will catch up on all of this after the holiday!

marcoyang1998 avatar Oct 05 '22 07:10 marcoyang1998

I updated the huggingface repo. Everything should be ready now.

marcoyang1998 avatar Oct 11 '22 04:10 marcoyang1998

> I updated the huggingface repo. Everything should be ready now.

Thanks! Will look at it later today.

csukuangfj avatar Oct 11 '22 04:10 csukuangfj

I need to update pretrained.py, as HuBERT requires raw waveforms as input.
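
To illustrate what waveform input implies for batching, here is a dependency-free sketch that pads variable-length waveforms to a common length and builds a padding mask. The helper name `pad_waveforms` is hypothetical, plain lists stand in for tensors, and the convention that `True` marks padded samples is an assumption; this is not the code in pretrained.py.

```python
def pad_waveforms(waveforms, pad_value=0.0):
    """Pad 1-D waveforms (lists of samples) to the longest one in the batch.

    Returns (batch, padding_mask), where padding_mask[i][t] is True
    when sample t of utterance i is padding (assumed convention).
    """
    if not waveforms:
        raise ValueError("need at least one waveform")
    max_len = max(len(w) for w in waveforms)
    batch = [list(w) + [pad_value] * (max_len - len(w)) for w in waveforms]
    padding_mask = [
        [False] * len(w) + [True] * (max_len - len(w)) for w in waveforms
    ]
    return batch, padding_mask
```

In real code the two outputs would be tensors of shape (batch, num_samples), with no fbank feature extraction in between.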

marcoyang1998 avatar Oct 11 '22 07:10 marcoyang1998