CTC-OptimizedLoss: why does MWER use stop gradient?


Open Mddct opened this issue 3 years ago • 3 comments

Why does the MWER loss use stop gradient? Is it just a form of regularization?

Nov 15 '21 14:11 Mddct

Why does the MWER loss use stop gradient? Is it just a form of regularization?

Maybe it is for variance reduction.

Nov 15 '21 14:11 Mddct

I find that the TF CTC beam search loses the gradients.

Dec 03 '21 12:12 leixiaoning

I find that the TF CTC beam search loses the gradients.

Beam search is only used to find candidate paths, so no gradient is needed through it. Gradients still flow back to the logits, because the path probabilities P that the MWER loss takes as input are computed from the logits. The N-best paths from CTC beam search can in fact be generated offline to speed up training.

Dec 09 '22 01:12 TeaPoly
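
To make the point above concrete, here is a minimal sketch in plain Python, not the code from this repository. It assumes the usual MWER form, where the loss is the sum over the N-best list of the renormalized hypothesis probability times the word errors relative to the N-best mean. The names `mwer_loss` and `mwer_grad`, the example log-probabilities, and the example word-error counts are all made up for illustration. The word errors come from the (non-differentiable) beam search and edit distance, so they enter only as constants; the gradient reaches the model only through the hypothesis log-probabilities.

```python
import math


def softmax(scores):
    """Renormalize hypothesis log-probabilities over the N-best list."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mwer_loss(log_probs, word_errors):
    """Expected relative word errors over an N-best list (hypothetical helper).

    log_probs:   log P(y_i | x) per hypothesis, a function of the logits.
    word_errors: edit-distance counts per hypothesis, treated as constants.
    """
    probs = softmax(log_probs)
    mean_err = sum(word_errors) / len(word_errors)
    return sum(p * (w - mean_err) for p, w in zip(probs, word_errors))


def mwer_grad(log_probs, word_errors):
    """Analytic gradient of mwer_loss with respect to log_probs only.

    No term involves a derivative of word_errors or of the beam search:
    those are fixed once the N-best list has been chosen.
    """
    probs = softmax(log_probs)
    expected_err = sum(p * w for p, w in zip(probs, word_errors))
    return [p * (w - expected_err) for p, w in zip(probs, word_errors)]


# Illustrative 3-best list (values are assumptions, not real decoder output).
log_probs = [-1.2, -2.0, -2.5]
word_errors = [0.0, 2.0, 1.0]

loss = mwer_loss(log_probs, word_errors)
grad = mwer_grad(log_probs, word_errors)
print("loss:", loss)
print("grad wrt log_probs:", grad)
```

In this sketch the hypothesis with the fewest word errors gets a negative gradient on its log-probability, so a gradient-descent step raises its probability. Since only `log_probs` needs to be differentiable, the N-best list and its word errors could be computed ahead of time, as the comment above suggests.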
