CTC-OptimizedLoss

Why does MWER use stop gradient?

Open Mddct opened this issue 3 years ago • 3 comments

Why does MWER use stop gradient? Is it just a regularization?

Mddct avatar Nov 15 '21 14:11 Mddct

Why does MWER use stop gradient? Is it just a regularization?

Maybe it is for variance reduction.

Mddct avatar Nov 15 '21 14:11 Mddct

I find that the TF CTC beam search loses the gradients.

leixiaoning avatar Dec 03 '21 12:12 leixiaoning

I find that the TF CTC beam search loses the gradients.

Beam search is only used to find candidate paths, so no gradient is required through it. Gradients are still propagated back to the logits, because the probability P of each candidate path is computed from the logits and is an input to the MWER loss. The N-best paths from CTC beam search can even be generated offline to speed up training.
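To make that concrete, here is a minimal sketch in plain Python, not the code from this repo. It assumes the common MWER form where the hypothesis log-probabilities are renormalized over the N-best list and weighted by the word errors minus their mean. The names `mwer_loss` and `mwer_grad` and the toy numbers are made up for illustration. The word errors are plain constants (they could come from an offline beam search), and the gradient is taken only with respect to the hypothesis log-probabilities, which in a real model are computed from the logits.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mwer_loss(log_probs, word_errors):
    """Expected relative word error over an N-best list.

    log_probs:   log P(y_i | x) per hypothesis (the differentiable part).
    word_errors: word error count per hypothesis (constants, no gradient).
    """
    probs = softmax(log_probs)  # renormalize over the N-best list
    mean_err = sum(word_errors) / len(word_errors)
    return sum(p * (w - mean_err) for p, w in zip(probs, word_errors))


def mwer_grad(log_probs, word_errors):
    """Analytic gradient of mwer_loss with respect to log_probs only."""
    probs = softmax(log_probs)
    mean_err = sum(word_errors) / len(word_errors)
    loss = mwer_loss(log_probs, word_errors)
    # d loss / d log_probs[j] = p_j * ((w_j - mean_err) - loss)
    return [p * ((w - mean_err) - loss) for p, w in zip(probs, word_errors)]


# Toy 3-best list: hypothetical scores and word error counts.
log_probs = [-1.2, -2.0, -3.5]
word_errors = [0.0, 2.0, 1.0]

loss = mwer_loss(log_probs, word_errors)
grad = mwer_grad(log_probs, word_errors)
print("loss:", loss)
print("grad w.r.t. log_probs:", grad)
```

In an autodiff framework the same idea would mean treating the decoded paths and their word errors as constants while keeping the graph from logits to log-probabilities intact.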

TeaPoly avatar Dec 09 '22 01:12 TeaPoly