How to add new words during fine-tuning?

srdfjy opened this issue 1 year ago • 6 comments

Hi, a pre-trained model's unit.txt contains 1000 words. When fine-tuning based on this pre-trained model, there are 10 new words that are not in unit.txt. Is it feasible to add these 10 new words to the end of unit.txt and assign them new IDs?
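
Concretely, I mean appending lines like these to the end of unit.txt (the token names below are just placeholders; the file maps one token per line to its integer ID):

```
...
last_pretrained_token 999
new_word_1 1000
new_word_2 1001
```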

srdfjy avatar Apr 10 '24 09:04 srdfjy

Freeze all modules except the output layers (the CTC output and the attention decoder output), add the new words to your unit.txt, modify the output size, and then fine-tune the model.
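
A minimal PyTorch sketch of the idea, assuming the checkpoint exposes the CTC projection as model.ctc.ctc_lo and the decoder projection as model.decoder.output_layer (attribute names can differ across WeNet versions, so check your own model first):

```python
import torch
import torch.nn as nn

def grow_linear(old: nn.Linear, new_vocab: int) -> nn.Linear:
    """Return a Linear with new_vocab output rows; pretrained rows are
    copied over, the extra rows keep their fresh random init."""
    new = nn.Linear(old.in_features, new_vocab, bias=old.bias is not None)
    with torch.no_grad():
        new.weight[:old.out_features] = old.weight
        if old.bias is not None:
            new.bias[:old.out_features] = old.bias
    return new

new_vocab = 1010  # 1000 pretrained units + 10 new words

# Resize the two output layers (module paths are assumptions, see above).
model.ctc.ctc_lo = grow_linear(model.ctc.ctc_lo, new_vocab)
model.decoder.output_layer = grow_linear(model.decoder.output_layer, new_vocab)

# Freeze everything, then unfreeze only the resized output layers.
for p in model.parameters():
    p.requires_grad = False
for p in model.ctc.ctc_lo.parameters():
    p.requires_grad = True
for p in model.decoder.output_layer.parameters():
    p.requires_grad = True
```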

fclearner avatar Apr 11 '24 02:04 fclearner

Thanks @fclearner, I will try what you suggested later.

srdfjy avatar May 06 '24 09:05 srdfjy

@fclearner doesn't the embedding layer in the decoder need to be changed too?

LiSongRan avatar Jul 12 '24 10:07 LiSongRan

@fclearner doesn't the embedding layer in the decoder need to be changed too?

Find the modules whose shapes match the output (vocabulary) size and change them.
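
For example, scanning the state dict for tensors that have a dimension equal to the old vocabulary size is a quick way to find everything that must grow (assuming model is the loaded model and the pretrained vocabulary had 1000 units):

```python
OLD_VOCAB = 1000  # size of the pretrained unit.txt

# Every tensor with a vocab-sized dimension needs resizing; this typically
# surfaces the CTC projection, the decoder output layer, and the decoder
# token embedding -- so yes, the embedding grows as well.
for name, tensor in model.state_dict().items():
    if OLD_VOCAB in tensor.shape:
        print(name, tuple(tensor.shape))
```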

fclearner avatar Jul 12 '24 10:07 fclearner

Could you provide an example?

LiSongRan avatar Jul 12 '24 11:07 LiSongRan

Could you provide an example?

Try printing the model's parameter shapes, or visualize the model with Netron.
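
For instance (assuming model is the loaded model; Netron at https://netron.app can inspect an exported ONNX file the same way, graphically):

```python
# List every parameter's name and shape; the vocab-sized entries are the
# ones that need to grow when new words are added.
for name, p in model.named_parameters():
    print(name, tuple(p.shape))
```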

fclearner avatar Jul 12 '24 11:07 fclearner

This issue has been automatically closed due to inactivity.

github-actions[bot] avatar Sep 14 '24 01:09 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

github-actions[bot] avatar Sep 21 '24 01:09 github-actions[bot]