icefall
WIP: Add doc about FST-based CTC forced alignment.
It is based on the CTC forced alignment API tutorial from torchaudio, but uses an FST-based approach instead.
I can produce output identical to torchaudio's using https://github.com/k2-fsa/kaldi-decoder.
I am refactoring the code and will prepare at least two Colab notebooks.
Hi, can the alignment tool produce word timestamps that are accurate at both the start and end positions?
It depends on what model you use.
You can have a look at https://pytorch.org/audio/main/tutorials/ctc_forced_alignment_api_tutorial.html. With the same model, we can produce results identical to torchaudio's.
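For readers unfamiliar with what a CTC forced aligner computes, here is a minimal sketch in plain Python. It is neither the FST-based implementation from this PR nor torchaudio's code; it is a textbook Viterbi pass over the CTC trellis, and the function names `ctc_forced_align` and `token_spans` are hypothetical. The spans are in frames, so converting them to seconds requires the frame shift of whatever model produced the log-probabilities, which is one reason timestamp accuracy depends on the model.

```python
import math


def ctc_forced_align(log_probs, targets, blank=0):
    """Return the best per-frame label sequence that collapses to `targets`.

    log_probs: list of T frames, each a list of V log-probabilities.
    targets:   list of token ids (without blanks).
    """
    T = len(log_probs)
    # Interleave blanks: blank, t1, blank, t2, ..., blank
    states = [blank]
    for tok in targets:
        states += [tok, blank]
    S = len(states)

    neg_inf = -math.inf
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][states[0]]
    if S > 1:
        score[0][1] = log_probs[0][states[1]]

    for t in range(1, T):
        for s in range(S):
            # Stay in the same state, or advance by one state.
            candidates = [(score[t - 1][s], s)]
            if s >= 1:
                candidates.append((score[t - 1][s - 1], s - 1))
            # Skipping a blank is allowed only between two different tokens.
            if s >= 2 and states[s] != blank and states[s] != states[s - 2]:
                candidates.append((score[t - 1][s - 2], s - 2))
            best, prev = max(candidates)
            score[t][s] = best + log_probs[t][states[s]]
            back[t][s] = prev

    # A path may end in the trailing blank or in the last token.
    s = S - 1
    if S > 1 and score[T - 1][S - 2] > score[T - 1][S - 1]:
        s = S - 2
    if score[T - 1][s] == neg_inf:
        raise ValueError("targets cannot be aligned to this many frames")

    # Backtrack from the last frame to the first.
    path = []
    for t in range(T - 1, -1, -1):
        path.append(states[s])
        s = back[t][s]
    path.reverse()
    return path


def token_spans(path, blank=0):
    """Collapse a frame-level path into (token, start_frame, end_frame) spans.

    end_frame is exclusive.
    """
    spans = []
    start = 0
    for t, label in enumerate(path + [blank]):
        prev = path[t - 1] if t > 0 else blank
        if label != prev:
            if prev != blank:
                spans.append((prev, start, t))
            start = t
    return spans


# Toy input: 5 frames, vocabulary {0: blank, 1, 2}; made-up probabilities.
hi, lo = math.log(0.8), math.log(0.1)
toy_log_probs = [
    [hi, lo, lo],
    [lo, hi, lo],
    [lo, hi, lo],
    [hi, lo, lo],
    [lo, lo, hi],
]
toy_path = ctc_forced_align(toy_log_probs, [1, 2])
toy_spans = token_spans(toy_path)
```

To get word-level timestamps, the token spans belonging to each word would be merged and the frame indices multiplied by the model's frame shift in seconds.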
@csukuangfj Will it be completed soon?
Yes. I am working on it now.
@csukuangfj Has the k2-based approach been forgotten?
No, it is still a TODO.
For now, please use the first approach, or add the second approach with k2 yourself.
All APIs you need are there. You only need to combine them.