snowfall
Can we merge a batch of text into one decoding graph?
Hello, I'm learning to use k2 to implement CTC. In the "create_decoding_graph" function, it seems that a batch of label texts is merged into one decoding graph. Is that reasonable? https://github.com/k2-fsa/snowfall/blob/26398e8e98a3ff83664a43b60884853514ba94e2/egs/librispeech/asr/simple_v1/train.py#L34
def create_decoding_graph(texts, L, symbols):
    word_ids_list = []
    for text in texts:
        filter_text = [
            i if i in symbols._sym2id else '<UNK>' for i in text.split(' ')
        ]
        word_ids = [symbols.get(i) for i in filter_text]
        word_ids_list.append(word_ids)
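As a self-contained illustration of what that loop does, here is a minimal sketch using a plain dict in place of snowfall's symbol table (the dict and the sample vocabulary are assumptions for illustration, not snowfall's actual data):

```python
# Sketch of the word-to-id mapping above, with a plain dict standing
# in for the symbol table (illustrative assumption).
def texts_to_word_ids(texts, sym2id, unk='<UNK>'):
    word_ids_list = []
    for text in texts:
        # Replace out-of-vocabulary words with <UNK> before lookup.
        filtered = [w if w in sym2id else unk for w in text.split(' ')]
        word_ids_list.append([sym2id[w] for w in filtered])
    return word_ids_list

sym2id = {'<UNK>': 1, 'hello': 2, 'world': 3}
print(texts_to_word_ids(['hello world', 'hello foo'], sym2id))
# [[2, 3], [2, 1]]  ('foo' is out of vocabulary, so it maps to <UNK>)
```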
And what is the meaning of the following line? Is it for the CTC blank self-loop? https://github.com/k2-fsa/snowfall/blob/26398e8e98a3ff83664a43b60884853514ba94e2/egs/librispeech/asr/simple_v1/train.py#L44
Sure, definitely we need to do this; we also plan to cache those static decoding graphs at the beginning of training.
In my opinion, we should create an independent label graph for each utterance. That means we can get a list of decoding graphs at the beginning of training, or during training. But according to the above code, it seems like they are merged into one graph. Is my understanding right?
The 'fsa' object actually supports an array of Fsas. They are not merged; they are separate but stored in one data structure.
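A rough pure-Python sketch of that idea (illustrative only, not k2's real layout): several FSAs can share one flat arc array plus row-splits offsets marking where each FSA begins, so the individual graphs stay separate even though there is a single data structure.

```python
# Sketch of storing an array of FSAs in one structure via a flat arc
# list plus row_splits offsets (the ragged-array idea; not k2's API).
class FsaVec:
    def __init__(self, fsas):
        self.arcs = []         # flat arc storage shared by all FSAs
        self.row_splits = [0]  # fsa i owns arcs[row_splits[i]:row_splits[i+1]]
        for arcs in fsas:
            self.arcs.extend(arcs)
            self.row_splits.append(len(self.arcs))

    def __getitem__(self, i):
        # Each FSA is recoverable unchanged: nothing was merged.
        return self.arcs[self.row_splits[i]:self.row_splits[i + 1]]

fsa0 = [(0, 1, 2), (1, 2, -1)]             # (src, dst, label) arcs
fsa1 = [(0, 1, 3), (1, 2, 4), (2, 3, -1)]
vec = FsaVec([fsa0, fsa1])
print(vec[0] == fsa0 and vec[1] == fsa1)   # True
```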
I see. So what's the meaning of "add_epsilon_self_loops"? https://github.com/k2-fsa/snowfall/blob/26398e8e98a3ff83664a43b60884853514ba94e2/egs/librispeech/asr/simple_v1/train.py#L44
We add self-loops with symbol zero (epsilon) here so the graph can match the blank in CTC training. You may want to check the call to intersect_dense_pruned below; we do CTC alignment with FSA intersection. Note that we view the network output as an Fsa (a DenseFsaVec) as well.
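The effect of adding epsilon self-loops can be sketched in plain Python on a simple arc-list representation (labels as integers, 0 = epsilon/blank, -1 = final arc; this is an illustration of the idea, not k2's implementation):

```python
# Sketch: give every state a self-loop with label 0 (epsilon) so the
# graph can absorb blank frames during CTC intersection (illustrative).
def add_epsilon_self_loops(arcs, num_states):
    loops = [(s, s, 0) for s in range(num_states)]
    return loops + arcs

arcs = [(0, 1, 5), (1, 2, -1)]  # linear FSA for a single symbol
print(add_epsilon_self_loops(arcs, 2))
# [(0, 0, 0), (1, 1, 0), (0, 1, 5), (1, 2, -1)]
```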
How to deal with repeated characters?
Repeated characters are not allowed on the successful path (currently, anyway).
I see. Maybe I can try to implement standard CTC with k2; is it worth doing?
Sure! It would just require changes in the FSA topology.
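For reference, a minimal sketch of the standard CTC topology as a plain arc list, where repeated frames of a token collapse into one emission and blank separates genuine repeats (symbol 0 is blank; this is an illustrative construction, not code from snowfall or k2):

```python
# Sketch of a standard CTC topology. State 0 is the blank state and
# state t corresponds to token t. Arcs are (src, dst, input, output);
# input is what a frame must carry, output is what the path emits.
def ctc_topo(num_tokens):
    arcs = [(0, 0, 0, 0)]                  # blank self-loop, emits nothing
    for t in range(1, num_tokens + 1):
        arcs.append((0, t, t, t))          # enter token t, emit it once
        arcs.append((t, t, t, 0))          # repeated frames emit nothing
        arcs.append((t, 0, 0, 0))          # blank returns to the start state
        for u in range(1, num_tokens + 1):
            if u != t:
                arcs.append((t, u, u, u))  # switch tokens without a blank
    return arcs

print(len(ctc_topo(2)))
# 9
```

A genuine repeat like "ll" must take the arc through the blank state, which is what standard CTC requires.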