How to add new words during fine-tuning?

srdfjy opened this issue 1 year ago • 6 comments

Hi, a pre-trained model's unit.txt contains 1000 words. When fine-tuning based on this pre-trained model, there are 10 new words that are not in unit.txt. Is it feasible to add these 10 new words to the end of unit.txt and assign them new IDs?
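
Concretely, I mean appending lines like these to the end of unit.txt (the token names below are just placeholders; the file maps one token per line to its integer ID):

```
...
last_pretrained_token 999
new_word_1 1000
new_word_2 1001
```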

srdfjy avatar Apr 10 '24 09:04 srdfjy

Freeze all modules except the output layers (the CTC output and the attention decoder output), add the new words to your unit.txt, modify the output size, and then fine-tune the model.
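
A minimal PyTorch sketch of the idea, assuming the checkpoint exposes the CTC projection as model.ctc.ctc_lo and the decoder projection as model.decoder.output_layer (attribute names can differ across WeNet versions, so check your own model first):

```python
import torch
import torch.nn as nn

def grow_linear(old: nn.Linear, new_vocab: int) -> nn.Linear:
    """Return a Linear with new_vocab output rows; pretrained rows are
    copied over, the extra rows keep their fresh random init."""
    new = nn.Linear(old.in_features, new_vocab, bias=old.bias is not None)
    with torch.no_grad():
        new.weight[:old.out_features] = old.weight
        if old.bias is not None:
            new.bias[:old.out_features] = old.bias
    return new

new_vocab = 1010  # 1000 pretrained units + 10 new words

# Resize the two output layers (module paths are assumptions, see above).
model.ctc.ctc_lo = grow_linear(model.ctc.ctc_lo, new_vocab)
model.decoder.output_layer = grow_linear(model.decoder.output_layer, new_vocab)

# Freeze everything, then unfreeze only the resized output layers.
for p in model.parameters():
    p.requires_grad = False
for p in model.ctc.ctc_lo.parameters():
    p.requires_grad = True
for p in model.decoder.output_layer.parameters():
    p.requires_grad = True
```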

fclearner avatar Apr 11 '24 02:04 fclearner

Thanks @fclearner, I will try what you suggested later.

srdfjy avatar May 06 '24 09:05 srdfjy

@fclearner doesn't the embedding layer in the decoder need to be changed too?

LiSongRan avatar Jul 12 '24 10:07 LiSongRan

@fclearner doesn't the embedding layer in the decoder need to be changed too?

Find the modules whose shapes match the output (vocabulary) size and change them.
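
For example, scanning the state dict for tensors that have a dimension equal to the old vocabulary size is a quick way to find everything that must grow (assuming model is the loaded model and the pretrained vocabulary had 1000 units):

```python
OLD_VOCAB = 1000  # size of the pretrained unit.txt

# Every tensor with a vocab-sized dimension needs resizing; this typically
# surfaces the CTC projection, the decoder output layer, and the decoder
# token embedding -- so yes, the embedding grows as well.
for name, tensor in model.state_dict().items():
    if OLD_VOCAB in tensor.shape:
        print(name, tuple(tensor.shape))
```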

fclearner avatar Jul 12 '24 10:07 fclearner

Could you provide an example?

LiSongRan avatar Jul 12 '24 11:07 LiSongRan

Could you provide an example?

Try printing the model's parameter shapes, or visualize the model with Netron.
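
For instance (assuming model is the loaded model; Netron at https://netron.app can inspect an exported ONNX file the same way, graphically):

```python
# List every parameter's name and shape; the vocab-sized entries are the
# ones that need to grow when new words are added.
for name, p in model.named_parameters():
    print(name, tuple(p.shape))
```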

fclearner avatar Jul 12 '24 11:07 fclearner

This issue has been automatically closed due to inactivity.

github-actions[bot] avatar Sep 14 '24 01:09 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

github-actions[bot] avatar Sep 21 '24 01:09 github-actions[bot]