kaldi-offline-transcriber
generating alignments
Hi,
Kaldi scripts usually generate alignments along with lattices, so you end up with both lat.*.gz and ali.*.gz files.
The offline transcriber, however, only produces lattices as output. Is there any way to also generate alignments?
Thanks!
Alignments in the form of CTM files can already be generated (see https://github.com/alumae/kaldi-offline-transcriber/blob/master/Makefile#L249). For example, you can run
make build/output/foo.ctm
which generates a CTM file for src-audio/foo.mp3.
If you need alignments in another format (e.g. phone alignments), you can look inside the steps/get_ctm.sh file and modify it according to your needs.
By the way, is there an existing script that converts CTM files into the Kaldi training data files (text, segments, utt2spk, spk2utt)?
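For reference, here is a minimal sketch of what such a conversion could look like. It is not part of kaldi-offline-transcriber. It assumes the CTM lines follow the usual layout (recording ID, channel, start time, duration, word, optional confidence), that a pause longer than max_gap seconds starts a new segment, and that each recording is treated as a single speaker. The function name ctm_to_kaldi_data, the max_gap parameter, and the utterance ID scheme are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def ctm_to_kaldi_data(ctm_lines, max_gap=0.5):
    """Group CTM words into segments; return {file name: list of lines}.

    Assumptions (not taken from the transcriber itself): field 1 is the
    recording ID, and each recording counts as one speaker.
    """
    # Collect (start, end, word) tuples per recording.
    words = defaultdict(list)
    for line in ctm_lines:
        fields = line.split()
        if len(fields) < 5 or line.startswith(";;"):
            continue  # skip blank, comment, or malformed lines
        rec, _channel, start, dur, word = fields[:5]
        words[rec].append((float(start), float(start) + float(dur), word))

    text, segments, utt2spk = [], [], []
    spk2utt = defaultdict(list)
    for rec, items in words.items():
        # Split the time-ordered words wherever the pause exceeds max_gap.
        chunks, current = [], []
        for item in sorted(items):
            if current and item[0] - current[-1][1] > max_gap:
                chunks.append(current)
                current = []
            current.append(item)
        if current:
            chunks.append(current)

        for chunk in chunks:
            start, end = chunk[0][0], chunk[-1][1]
            # Hypothetical utterance ID: recording ID plus times in centiseconds.
            utt = f"{rec}-{round(start * 100):07d}-{round(end * 100):07d}"
            text.append(f"{utt} {' '.join(w for _, _, w in chunk)}")
            segments.append(f"{utt} {rec} {start:.2f} {end:.2f}")
            utt2spk.append(f"{utt} {rec}")
            spk2utt[rec].append(utt)

    return {
        "text": sorted(text),
        "segments": sorted(segments),
        "utt2spk": sorted(utt2spk),
        "spk2utt": sorted(f"{s} {' '.join(sorted(u))}" for s, u in spk2utt.items()),
    }
```

Each returned list can be written to a file of the same name in a data directory. A wav.scp mapping recording IDs to audio still has to be created separately, and Kaldi's utils/validate_data_dir.sh (with --no-feats) can be used to check the result.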