NAG-BERT Question regarding the speed up

Open allanj opened this issue 3 years ago • 2 comments

I saw that the paper uses an argmax in the equation for obtaining the output sequence. As I understand it, that corresponds to the Viterbi algorithm, whose complexity is again O(n). I'm confused about how this is faster than the autoregressive approach.

May 19 '21 06:05 allanj

I think the reason is that the model only runs once, followed by Viterbi decoding. An autoregressive model has to run n times.

Jul 05 '21 06:07 clearloveclearlove

Hello, thank you for your question. The speed-up comes from the fact that NAG-BERT only does the forward computation once, whereas autoregressive models have to do a forward pass n times, where n is the length of the output sequence.

Nov 01 '21 15:11 yxuansu
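
To make the point in the replies concrete, here is a minimal sketch that counts forward passes for the two decoding styles. It is not the NAG-BERT code: `CountingModel`, its made-up scoring rule, and both decode functions are hypothetical stand-ins, and a simple per-position argmax is used in place of the Viterbi step mentioned above.

```python
from typing import List


class CountingModel:
    """Hypothetical toy stand-in for a neural network; it only counts forward passes."""

    def __init__(self, vocab_size: int = 5) -> None:
        self.vocab_size = vocab_size
        self.calls = 0

    def forward(self, tokens: List[int], length: int) -> List[List[float]]:
        """Return one score vector per output position (arbitrary deterministic scores)."""
        self.calls += 1  # in a real model, this is the expensive part
        base = sum(tokens)
        return [
            [float((base + pos * 3 + v * 7) % 11) for v in range(self.vocab_size)]
            for pos in range(length)
        ]


def argmax(scores: List[float]) -> int:
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def autoregressive_decode(model: CountingModel, source: List[int], n: int) -> List[int]:
    """Generate n tokens one at a time; each token needs its own forward pass."""
    output: List[int] = []
    for _ in range(n):
        # Each step conditions on everything generated so far.
        scores = model.forward(source + output, len(output) + 1)
        output.append(argmax(scores[-1]))
    return output


def non_autoregressive_decode(model: CountingModel, source: List[int], n: int) -> List[int]:
    """Score all n positions in a single forward pass, then pick tokens cheaply."""
    scores = model.forward(source, n)
    # Stand-in for the decoding step: O(n) work, but no further model calls.
    return [argmax(position_scores) for position_scores in scores]


if __name__ == "__main__":
    ar_model, nar_model = CountingModel(), CountingModel()
    autoregressive_decode(ar_model, [1, 2, 3], 6)
    non_autoregressive_decode(nar_model, [1, 2, 3], 6)
    print("autoregressive forward passes:", ar_model.calls)
    print("non-autoregressive forward passes:", nar_model.calls)
```

Both loops are linear in n, but only the autoregressive loop puts a full model call inside each iteration. In the non-autoregressive case the per-step work after the single forward pass is a lightweight decoding operation, which is where the speed-up described in the replies comes from.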
