StreamingTransformer
Can you explain the algorithm used by prefix_recognize?
In streaming_transformer.py, prefix_recognize looks like a frame-synchronous decoding algorithm that merges chunk decoding and trigger decoding. I searched for papers on chunk transformers and triggered attention but could not find one. Can you point me to the paper that introduced the algorithm?

I also have a question about the CTC prefix search in the code. Lines 662-664 read:

    if l_plus not in hype:
        Pb[l_plus] += lpz[i][0] + ...
        Pb[l_plus] += lpz[i][c] * Pnb_prev[l_plus]
I suspect line 664 should be:

    Pnb[l_plus] += lpz[i][c] * Pnb_prev[l_plus]

This matches Algorithm 1 in the paper "First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs".
Yes, you are right. I've fixed the bug in my repo now.
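
For anyone landing on this thread later, here is a minimal sketch of CTC prefix beam search in the spirit of Algorithm 1 from that paper, showing where the non-blank update discussed above sits. It is not the code from this repo: the function name `prefix_beam_search`, the `probs` layout, and the blank index are assumptions made for illustration, the language-model terms of the paper are left out, and it works with plain probabilities rather than log scores.

```python
from collections import defaultdict

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def prefix_beam_search(probs, beam_size=8):
    """Hypothetical helper: probs[t][c] is the probability of symbol c at frame t."""
    # Pb / Pnb hold the probability of a prefix whose alignment ends in
    # a blank / a non-blank symbol, for the previous frame.
    Pb_prev, Pnb_prev = {(): 1.0}, {(): 0.0}
    A_prev = [()]
    for frame in probs:
        Pb, Pnb = defaultdict(float), defaultdict(float)
        in_beam = set(A_prev)
        for l in A_prev:
            pb_l, pnb_l = Pb_prev.get(l, 0.0), Pnb_prev.get(l, 0.0)
            for c, p in enumerate(frame):
                if c == BLANK:
                    # Emitting blank keeps the prefix unchanged.
                    Pb[l] += p * (pb_l + pnb_l)
                    continue
                l_plus = l + (c,)
                if l and c == l[-1]:
                    # A repeated symbol extends the prefix only after a blank;
                    # otherwise it collapses into the same prefix.
                    Pnb[l_plus] += p * pb_l
                    Pnb[l] += p * pnb_l
                else:
                    Pnb[l_plus] += p * (pb_l + pnb_l)
                if l_plus not in in_beam:
                    # l_plus was not expanded itself this frame, so carry over
                    # whatever probability it had at the previous frame.
                    Pb[l_plus] += frame[BLANK] * (
                        Pb_prev.get(l_plus, 0.0) + Pnb_prev.get(l_plus, 0.0)
                    )
                    # The update in question: it accumulates into Pnb, not Pb.
                    Pnb[l_plus] += p * Pnb_prev.get(l_plus, 0.0)
        scores = {l: Pb[l] + Pnb[l] for l in set(Pb) | set(Pnb)}
        A_prev = sorted(scores, key=scores.get, reverse=True)[:beam_size]
        Pb_prev, Pnb_prev = Pb, Pnb
    return [(l, Pb_prev[l] + Pnb_prev[l]) for l in A_prev]
```

Calling `prefix_beam_search(probs, beam_size=k)` returns the surviving prefixes (as tuples of symbol indices) with their total probability, best first. With a beam wide enough that nothing is pruned, the scores should equal the sum over all alignments that collapse to each prefix, which is a handy sanity check on the update rules.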