wenet

Decoding with a language model drops characters

Open wwfcnu opened this issue 6 months ago • 4 comments

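When I decode with the language model's TLG during recognition, many characters get dropped, and the WER is much higher than without the language model.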

wwfcnu avatar Jun 03 '25 08:06 wwfcnu

Try tuning the acoustic scale.

wwfcnu created an issue (wenet-e2e/wenet#2735) https://github.com/wenet-e2e/wenet/issues/2735

Mddct avatar Jun 03 '25 08:06 Mddct

> Try tuning the acoustic scale.

Tuning either the acoustic scale or length_penalty helps. Which of the two should I tune first?

wwfcnu avatar Jun 06 '25 11:06 wwfcnu
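
One way to answer the "which one first" question empirically is to sweep both parameters on a held-out set and rank the combinations by error rate. The following is a minimal sketch, not part of wenet: `decode_fn` and `score_fn` are hypothetical callables you would supply yourself (for example, `decode_fn` could wrap a run of your decoder with the given `acoustic_scale` and `length_penalty`, and `score_fn` could compute WER against your references).

```python
import itertools


def sweep(decode_fn, score_fn, acoustic_scales, length_penalties):
    """Try every (acoustic_scale, length_penalty) pair and rank by score.

    decode_fn: hypothetical callable taking acoustic_scale and
        length_penalty as keyword arguments and returning hypotheses.
    score_fn: hypothetical callable mapping hypotheses to an error
        rate (lower is better).
    Returns a list of (score, acoustic_scale, length_penalty) tuples,
    sorted so that the best setting comes first.
    """
    results = []
    for scale, penalty in itertools.product(acoustic_scales, length_penalties):
        hyps = decode_fn(acoustic_scale=scale, length_penalty=penalty)
        results.append((score_fn(hyps), scale, penalty))
    return sorted(results)
```

Looking at how the score changes along each axis of the grid shows which parameter the error rate is more sensitive to on your data, which is a reasonable basis for deciding what to tune first.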

After tuning the acoustic scale and length_penalty, decoding sometimes outputs repeated characters.

wwfcnu avatar Jun 10 '25 12:06 wwfcnu
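
Dropped characters show up as deletions and repeated characters as insertions, so splitting the error count by type makes it easier to see which way a setting is pushing the decoder. A minimal sketch, assuming character-level scoring on plain strings; `error_breakdown` is a hypothetical helper written for this illustration, not a wenet function.

```python
def error_breakdown(ref: str, hyp: str) -> dict:
    """Count substitutions, deletions and insertions between ref and hyp
    using a character-level Levenshtein alignment."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (char missing in hyp)
                dist[i][j - 1] + 1,         # insertion (extra char in hyp)
            )

    # Walk back through the table to recover the operation counts.
    counts = {"sub": 0, "del": 0, "ins": 0, "ref_len": n}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                counts["sub"] += cost
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["del"] += 1
            i -= 1
        else:
            counts["ins"] += 1
            j -= 1
    return counts
```

For example, `error_breakdown("今天天气很好", "今天气好")` reports two deletions and no insertions or substitutions. Summing these counts over a test set before and after a parameter change shows whether deletions were actually reduced or merely traded for insertions.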

Setting length_penalty=-5.0 also reduces the dropped characters.

wwfcnu avatar Jun 18 '25 06:06 wwfcnu
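
The observations in this thread fit a simple picture of how the two parameters interact. The sketch below is a toy model only, not wenet's decoder code: it assumes a hypothesis cost of `graph_cost - acoustic_scale * am_logprob + length_penalty * num_tokens` with lower cost winning, and all numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hyp:
    text: str
    am_logprob: float  # acoustic log-probability (<= 0), made-up value
    graph_cost: float  # LM/graph cost (>= 0), grows with each token
    num_tokens: int


def total_cost(h: Hyp, acoustic_scale: float, length_penalty: float) -> float:
    # Toy assumption: lower cost is better; a negative length_penalty
    # makes every emitted token cheaper, which favors longer outputs.
    return (h.graph_cost
            - acoustic_scale * h.am_logprob
            + length_penalty * h.num_tokens)


def best(hyps, acoustic_scale: float, length_penalty: float) -> str:
    return min(hyps, key=lambda h: total_cost(h, acoustic_scale, length_penalty)).text


HYPS = [
    Hyp("full", am_logprob=-4.0, graph_cost=12.0, num_tokens=6),
    Hyp("dropped", am_logprob=-7.0, graph_cost=8.0, num_tokens=4),
    Hyp("repeated", am_logprob=-6.0, graph_cost=15.0, num_tokens=7),
]

print(best(HYPS, acoustic_scale=1.0, length_penalty=0.0))   # dropped
print(best(HYPS, acoustic_scale=2.0, length_penalty=0.0))   # full
print(best(HYPS, acoustic_scale=1.0, length_penalty=-1.0))  # full
print(best(HYPS, acoustic_scale=1.0, length_penalty=-6.0))  # repeated
```

In this toy setup the language model charges a cost per token, so the shortest hypothesis wins by default. Raising the acoustic scale or applying a mildly negative length penalty restores the full hypothesis, while a strongly negative penalty overshoots and prefers the one with an extra repeated token. That matches the trade-off reported above, but the real decoder's scoring should be checked in the wenet source.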