generating alignments

Open yasheshgaur opened this issue 8 years ago • 2 comments

Hi,

Kaldi scripts usually generate alignments along with lattices, so you end up with both lat.*.gz and ali.*.gz files.

In the offline transcriber, however, we only get the lattices as output. Is there any way to also generate alignments?

Thanks!

yasheshgaur avatar Dec 22 '15 22:12 yasheshgaur

Alignments in the form of CTM files can already be generated (see https://github.com/alumae/kaldi-offline-transcriber/blob/master/Makefile#L249). I.e., you may invoke

make build/output/foo.ctm

which generates a CTM file for src-audio/foo.mp3.
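
For anyone who wants to post-process that output, here is a rough sketch of reading a CTM file in Python. It assumes the standard CTM layout (recording ID, channel, start time in seconds, duration in seconds, word, optional confidence); the `CtmWord` type, the `parse_ctm` function, and the sample lines are made up for illustration and are not part of kaldi-offline-transcriber.

```python
from typing import Iterable, List, NamedTuple, Optional


class CtmWord(NamedTuple):
    """One word entry from a CTM file (hypothetical helper type)."""
    recording: str
    channel: str
    start: float       # seconds from the start of the recording
    duration: float    # seconds
    word: str
    confidence: Optional[float]  # None when the column is absent


def parse_ctm(lines: Iterable[str]) -> List[CtmWord]:
    """Parse CTM lines, skipping blank lines and ';;' comment lines."""
    words = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;"):
            continue
        fields = line.split()
        if len(fields) < 5:
            raise ValueError(f"malformed CTM line: {line!r}")
        confidence = float(fields[5]) if len(fields) > 5 else None
        words.append(
            CtmWord(fields[0], fields[1], float(fields[2]),
                    float(fields[3]), fields[4], confidence)
        )
    return words


# Invented sample data; a real file would be read with open("build/output/foo.ctm").
sample = """\
foo 1 0.50 0.32 hello 0.98
foo 1 0.82 0.41 world
"""

words = parse_ctm(sample.splitlines())
for w in words:
    end = w.start + w.duration
    print(f"{w.recording} {w.start:.2f}-{end:.2f} {w.word}")
```

Each entry carries a start time and a duration, so the end time of a word is simply their sum.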

If you need alignments in another format (e.g. phone alignments), you can look inside the steps/get_ctm.sh file and modify it according to your needs.

alumae avatar Dec 23 '15 00:12 alumae

By the way, is there an existing script to convert CTM files into the Kaldi training data files (text, segments, utt2spk, spk2utt)?

vince62s avatar Apr 10 '16 17:04 vince62s