snowfall
Can we merge a batch of text into one decoding graph?
Hello, I'm learning to use k2 to implement CTC. In the "create_decoding_graph" function, it seems that a batch of label texts is merged into one decoding graph. Is that reasonable? https://github.com/k2-fsa/snowfall/blob/26398e8e98a3ff83664a43b60884853514ba94e2/egs/librispeech/asr/simple_v1/train.py#L34
def create_decoding_graph(texts, L, symbols):
    word_ids_list = []
    for text in texts:
        filter_text = [
            i if i in symbols._sym2id else '<UNK>' for i in text.split(' ')
        ]
        word_ids = [symbols.get(i) for i in filter_text]
        word_ids_list.append(word_ids)
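As a self-contained illustration of what that loop does, here is a minimal sketch using a plain dict in place of snowfall's symbol table (the dict and the sample vocabulary are assumptions for illustration, not snowfall's actual data):

```python
# Sketch of the word-to-id mapping above, with a plain dict standing
# in for the symbol table (illustrative assumption).
def texts_to_word_ids(texts, sym2id, unk='<UNK>'):
    word_ids_list = []
    for text in texts:
        # Replace out-of-vocabulary words with <UNK> before lookup.
        filtered = [w if w in sym2id else unk for w in text.split(' ')]
        word_ids_list.append([sym2id[w] for w in filtered])
    return word_ids_list

sym2id = {'<UNK>': 1, 'hello': 2, 'world': 3}
print(texts_to_word_ids(['hello world', 'hello foo'], sym2id))
# [[2, 3], [2, 1]]  ('foo' is out of vocabulary, so it maps to <UNK>)
```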
And what is the meaning of the following line? Is it for the CTC blank self-loop? https://github.com/k2-fsa/snowfall/blob/26398e8e98a3ff83664a43b60884853514ba94e2/egs/librispeech/asr/simple_v1/train.py#L44
Sure, definitely we need to do this; we also plan to cache those static decoding graphs at the beginning of training.
In my opinion, we should create an independent label graph for each utterance. That means we can get a list of decoding graphs at the beginning of training, or during training. But according to the above code, it seems like they are merged into one graph. Is my understanding right?
The 'fsa' object actually supports an array of Fsas. They are not merged; they are separate but stored in one data structure.
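A rough pure-Python sketch of that idea (illustrative only, not k2's real layout): several FSAs can share one flat arc array plus row-splits offsets marking where each FSA begins, so the individual graphs stay separate even though there is a single data structure.

```python
# Sketch of storing an array of FSAs in one structure via a flat arc
# list plus row_splits offsets (the ragged-array idea; not k2's API).
class FsaVec:
    def __init__(self, fsas):
        self.arcs = []         # flat arc storage shared by all FSAs
        self.row_splits = [0]  # fsa i owns arcs[row_splits[i]:row_splits[i+1]]
        for arcs in fsas:
            self.arcs.extend(arcs)
            self.row_splits.append(len(self.arcs))

    def __getitem__(self, i):
        # Each FSA is recoverable unchanged: nothing was merged.
        return self.arcs[self.row_splits[i]:self.row_splits[i + 1]]

fsa0 = [(0, 1, 2), (1, 2, -1)]             # (src, dst, label) arcs
fsa1 = [(0, 1, 3), (1, 2, 4), (2, 3, -1)]
vec = FsaVec([fsa0, fsa1])
print(vec[0] == fsa0 and vec[1] == fsa1)   # True
```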
I see. So what's the meaning of "add_epsilon_self_loops"? https://github.com/k2-fsa/snowfall/blob/26398e8e98a3ff83664a43b60884853514ba94e2/egs/librispeech/asr/simple_v1/train.py#L44
We add self-loops with symbol zero (epsilon) here so the graph can match the blank in CTC training. You may want to check the call to intersect_dense_pruned below; we do CTC alignment with FSA intersection. Note that we view the network output as an Fsa (a DenseFsaVec) as well.
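The effect of adding epsilon self-loops can be sketched in plain Python on a simple arc-list representation (labels as integers, 0 = epsilon/blank, -1 = final arc; this is an illustration of the idea, not k2's implementation):

```python
# Sketch: give every state a self-loop with label 0 (epsilon) so the
# graph can absorb blank frames during CTC intersection (illustrative).
def add_epsilon_self_loops(arcs, num_states):
    loops = [(s, s, 0) for s in range(num_states)]
    return loops + arcs

arcs = [(0, 1, 5), (1, 2, -1)]  # linear FSA for a single symbol
print(add_epsilon_self_loops(arcs, 2))
# [(0, 0, 0), (1, 1, 0), (0, 1, 5), (1, 2, -1)]
```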
How to deal with repeated characters?
Repeated characters are not allowed on the successful path (currently, anyway).
I see. Maybe I can try to implement standard CTC with k2; is it worth doing?
Sure! It would just require changes in the FSA topology.
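For reference, a minimal sketch of the standard CTC topology as a plain arc list, where repeated frames of a token collapse into one emission and blank separates genuine repeats (symbol 0 is blank; this is an illustrative construction, not code from snowfall or k2):

```python
# Sketch of a standard CTC topology. State 0 is the blank state and
# state t corresponds to token t. Arcs are (src, dst, input, output);
# input is what a frame must carry, output is what the path emits.
def ctc_topo(num_tokens):
    arcs = [(0, 0, 0, 0)]                  # blank self-loop, emits nothing
    for t in range(1, num_tokens + 1):
        arcs.append((0, t, t, t))          # enter token t, emit it once
        arcs.append((t, t, t, 0))          # repeated frames emit nothing
        arcs.append((t, 0, 0, 0))          # blank returns to the start state
        for u in range(1, num_tokens + 1):
            if u != t:
                arcs.append((t, u, u, u))  # switch tokens without a blank
    return arcs

print(len(ctc_topo(2)))
# 9
```

A genuine repeat like "ll" must take the arc through the blank state, which is what standard CTC requires.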