Montreal-Forced-Aligner No TextGrid files in output folder, no error message

I've encountered a problem running the MFA where no error is thrown and no TextGrids are written to the Output folder.

I have 9 speakers, with ~20 minutes of speech for each speaker. The transcriptions are TextGrids with a single tier of relatively short, orthographically transcribed utterances. I ran the mfa_align command using the pretrained English model and the LibriSpeech dictionary. The aligner seems to run fine, and no error is produced, but there are no TextGrid files in the specified output folder.

I have this error with versions 1.1.0 and 1.0.1, and on both Linux and Windows.

Anybody have an idea of what is going on?

Nov 14 '19 00:11 sj-perry

I got a similar issue when trying the example: no TextGrid file was produced in the output directory. I checked the log file (default path is ~/Documents/MFA/<corpus_name>/logging/corpus.log) and saw something like "The following utterances were ignored due to lack of features". It seems to me the binary has trouble extracting MFCC features from the audio files.
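For anyone who wants to check their own log for this, here is a minimal Python sketch that scans a log for the message quoted above. The marker string is taken from this post and may be worded differently in other MFA versions; the function names are just illustrative.

```python
from pathlib import Path

# Phrase quoted in the post above; the exact wording may differ between MFA versions.
MARKER = "ignored due to lack of features"


def find_feature_warnings(log_text):
    """Return the log lines that mention utterances dropped for lack of features."""
    return [line.strip() for line in log_text.splitlines() if MARKER in line]


def check_corpus_log(log_path):
    """Read a corpus.log file and return any matching lines (empty list if none)."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return find_feature_warnings(text)
```

Calling check_corpus_log on the corpus.log path mentioned above returns an empty list when the message is absent, which would suggest looking at the alignment logs instead.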

Dec 28 '19 06:12 manazhao

I also got no output in ../Montreal-Forced-Aligner/examples/alignment

bin/mfa_align ../Montreal-Forced-Aligner/examples/ch data-mandarin/chinese.dict.txt pretrained_models/mandarin.zip ../Montreal-Forced-Aligner/examples/alignment
Setting up corpus information...
Number of speakers in corpus: 1, average number of utterances per speaker: 5.0
Creating dictionary information...
Setting up corpus_data directory...
Generating base features (mfcc)...
Calculating CMVN...
Done with setup.
Done! Everything took 1.4139506816864014 seconds

Apr 17 '20 11:04 ABC0408

Using 1.0.1 on Mac.

I get accurate output with my own English example, my own Spanish example, and the Mandarin example provided at https://montreal-forced-aligner.readthedocs.io/en/latest/example.html, but no output with my own Mandarin WAV. I got:

mandarin_wav sample_mandarin_dict.txt pretrained_models/mandarin.zip output
Setting up corpus information...
Number of speakers in corpus: 1, average number of utterances per speaker: 2.0
Creating dictionary information...
Setting up training data...
Calculating MFCCs...
Calculating CMVN...
Number of speakers in corpus: 1, average number of utterances per speaker: 2.0
Done with setup.
100%|█████████████████████████████████████████████| 2/2 [00:01<00:00,  1.07it/s]
Done! Everything took 4.492555856704712 seconds

People have pointed out that this is the result of failed alignment, with errors logged in ~/Documents/MFA/XXXXX/tri_ali/log/align.0.0.log (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/84). Indeed, compare the log for my successful Spanish run:

gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=10 --retry-beam=40 --careful=false 'gmm-boost-silence --boost=1.0 6 "/Users/xzfang/Documents/MFA/sample_spanish_wav/tri_ali/0.mdl" - |' ark:/Users/xzfang/Documents/MFA/sample_spanish_wav/tri_ali/fsts.0 ark:/Users/xzfang/Documents/MFA/sample_spanish_wav/train/split1/cmvndeltafeats_fmllr.0 ark:- 
gmm-boost-silence --boost=1.0 6 /Users/xzfang/Documents/MFA/sample_spanish_wav/tri_ali/0.mdl - 
WARNING (gmm-boost-silence[5.4.251~1-094d2]:main():gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
LOG (gmm-boost-silence[5.4.251~1-094d2]:main():gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
LOG (gmm-boost-silence[5.4.251~1-094d2]:main():gmm-boost-silence.cc:103) Wrote model to -
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:127) Savannah_beso_a_Emilia
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:127) Savannah_pateo_a_Emilia
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:135) Overall log-likelihood per frame is -109.2 over 551 frames.
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:137) Retried 0 out of 2 utterances.
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:139) Done 2, errors on 0

and the log for my failed Mandarin run:

gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=10 --retry-beam=40 --careful=false 'gmm-boost-silence --boost=1.0 6 "/Users/xzfang/Documents/MFA/sample_mandarin_wav_file_name_no_chinese_char/tri_ali/0.mdl" - |' ark:/Users/xzfang/Documents/MFA/sample_mandarin_wav_file_name_no_chinese_char/tri_ali/fsts.0 ark:/Users/xzfang/Documents/MFA/sample_mandarin_wav_file_name_no_chinese_char/train/split1/cmvndeltafeats_fmllr.0 ark:- 
gmm-boost-silence --boost=1.0 6 /Users/xzfang/Documents/MFA/sample_mandarin_wav_file_name_no_chinese_char/tri_ali/0.mdl - 
WARNING (gmm-boost-silence[5.4.251~1-094d2]:main():gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
LOG (gmm-boost-silence[5.4.251~1-094d2]:main():gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
LOG (gmm-boost-silence[5.4.251~1-094d2]:main():gmm-boost-silence.cc:103) Wrote model to -
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:127) 1
WARNING (gmm-align-compiled[5.4.251~1-094d2]:AlignUtteranceWrapper():decoder-wrappers.cc:466) Retrying utterance 1 with beam 40
WARNING (gmm-align-compiled[5.4.251~1-094d2]:AlignUtteranceWrapper():decoder-wrappers.cc:475) Did not successfully decode file 1, len = 666
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:127) 2
WARNING (gmm-align-compiled[5.4.251~1-094d2]:AlignUtteranceWrapper():decoder-wrappers.cc:466) Retrying utterance 2 with beam 40
WARNING (gmm-align-compiled[5.4.251~1-094d2]:AlignUtteranceWrapper():decoder-wrappers.cc:475) Did not successfully decode file 2, len = 666
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:135) Overall log-likelihood per frame is nan over 0 frames.
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:137) Retried 2 out of 2 utterances.
LOG (gmm-align-compiled[5.4.251~1-094d2]:main():gmm-align-compiled.cc:139) Done 0, errors on 2
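The two logs differ mainly in the "Did not successfully decode file" warnings and the final "Done N, errors on M" line. A rough Python sketch for pulling those out of an align log follows. The regular expressions are based only on the log lines shown in this thread, so other Kaldi or MFA versions may format them differently.

```python
import re

# Patterns derived from the log excerpts in this thread (an assumption, not a spec).
SUMMARY_RE = re.compile(r"Done (\d+), errors on (\d+)")
FAILED_RE = re.compile(r"Did not successfully decode file (\S+),")


def summarize_align_log(log_text):
    """Extract the done/error counts and the IDs of utterances that failed to decode."""
    failed = FAILED_RE.findall(log_text)
    match = SUMMARY_RE.search(log_text)
    if match:
        done, errors = int(match.group(1)), int(match.group(2))
    else:
        # No summary line found, e.g. the log is truncated.
        done, errors = None, None
    return {"done": done, "errors": errors, "failed": failed}
```

On the failed Mandarin log this would report done = 0, errors = 2, and failed utterances "1" and "2". A result with done = 0 means no alignments were produced, which is consistent with an empty output folder.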

I hope this is only a problem with Mandarin. I am using p2fa (https://web.sas.upenn.edu/phonetics-lab/facilities/) for both English and Mandarin without problems.

By the way, stereo is not the problem here (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/107): 1.0.1 can handle stereo, and my English WAV was stereo.

Jun 07 '20 20:06 xf15

I had the same issue (no TextGrids output files).

Making sure all the words are in the dictionary fixed it for me (i.e. no prompt to fix words not in the dictionary and an empty oovs_found.txt file).
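If it helps, here is a small Python sketch for finding out-of-dictionary words before running the aligner. It assumes the dictionary is a plain-text file with one entry per line, the word first and the phones after it, separated by whitespace. The lowercasing and punctuation stripping are my own simplification and may not match how MFA normalizes text.

```python
def load_dictionary_words(dict_text):
    """Collect the first whitespace-separated field of each non-empty line."""
    words = set()
    for line in dict_text.splitlines():
        parts = line.split()
        if parts:
            words.add(parts[0].lower())
    return words


def find_oovs(transcript, dictionary_words):
    """Return transcript words missing from the dictionary, in order of first appearance."""
    missing = []
    for token in transcript.lower().split():
        # Strip common punctuation; this is a simplification, not MFA's own rule.
        word = token.strip(".,!?;:\"()")
        if word and word not in dictionary_words and word not in missing:
            missing.append(word)
    return missing
```

An empty list from find_oovs for every transcript corresponds to the empty oovs_found.txt case described above.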

Oct 16 '20 19:10 simlmx

Using Windows 10, I had the same issue: no TextGrid output files. Instead, I find an empty oovs_found.txt file.

This was the result of failed alignment; the alignment log contains:

gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=10 --retry-beam=40 --careful=false 'gmm-boost-silence --boost=1.0 6 "C:\Users\Brandon/Documents/MFA\data\tri_ali\0.mdl" - |' 'ark:C:\Users\Brandon/Documents/MFA\data\tri_ali\fsts.0' 'ark:C:\Users\Brandon/Documents/MFA\data\train\split1\cmvndeltafeats_fmllr.0' ark:-
gmm-boost-silence --boost=1.0 6 'C:\Users\Brandon/Documents/MFA\data\tri_ali\0.mdl' -
WARNING (gmm-boost-silence[5.4-win]:main():e:\dev\tools\kaldi\src\gmmbin\gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
LOG (gmm-boost-silence[5.4-win]:main():e:\dev\tools\kaldi\src\gmmbin\gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
LOG (gmm-boost-silence[5.4-win]:main():e:\dev\tools\kaldi\src\gmmbin\gmm-boost-silence.cc:103) Wrote model to -
LOG (gmm-align-compiled[5.4-win]:main():e:\dev\tools\kaldi\src\gmmbin\gmm-align-compiled.cc:135) Overall log-likelihood per frame is -nan(ind) over 0 frames.
LOG (gmm-align-compiled[5.4-win]:main():e:\dev\tools\kaldi\src\gmmbin\gmm-align-compiled.cc:137) Retried 0 out of 0 utterances.
LOG (gmm-align-compiled[5.4-win]:main():e:\dev\tools\kaldi\src\gmmbin\gmm-align-compiled.cc:139) Done 0, errors on 0

No errors, and nothing done.

Dec 14 '20 22:12 Davidelvis

Try increasing `beam` value.

By default it is 10. I had a 30-second audio file, and for that I used beam=100. If you are using the CLI, add the argument: mfa align ... --beam 100. Apart from that, I found that TextGrids are also saved in the temporary directory: if you use the -t or --temp argument, you will find your TextGrids in <folder_name>_pretrained_aligner/pretrained_aligner/textgrids.
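To see whether any TextGrids ended up somewhere under the temporary directory, a recursive search is a quick check. This is only a sketch in Python: the function name is made up, and the directory you pass in is whatever you gave to -t or --temp.

```python
from pathlib import Path


def find_textgrids(root):
    """Return all files under root whose extension is .TextGrid (case-insensitive)."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".textgrid"
    )
```

If this returns files while the output folder is empty, the alignment itself worked and only the export location is unexpected.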

Feb 01 '22 16:02 ambiSk

Another project relies on this tool. I also encountered a similar problem when using that, and haven't found a solution yet. Who can help me?

Mar 10 '23 12:03 zhaolibo1989

Just encountered this problem: log was not showing any problems ("Done XX, errors on 0") but no TextGrid files were appearing in output folder. Increasing beam didn't work.

Eventually fixed the issue by adding the --clean flag when running align. Might be good to point out in the intro that default behavior on validate and align is to not overwrite previous runs!

Jun 18 '23 02:06 wesley-js-leong

Adding --clean and --overwrite worked for me! See https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/configuration/index.html

mfa align --clean --overwrite ...

I got this idea from the tutorial: https://www.youtube.com/watch?v=phVZijLo9ro
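For scripted runs, the flags mentioned in this thread can be assembled in one place. This is a hedged Python sketch: the positional order (corpus, dictionary, acoustic model, output) is assumed from the mfa_align invocations earlier in the thread, the paths are placeholders, and you should confirm the flags against mfa align --help for your version.

```python
import shlex


def build_align_command(corpus_dir, dictionary, acoustic_model, output_dir,
                        beam=None, clean=True, overwrite=True):
    """Build an argument list for `mfa align` using flags mentioned in this thread."""
    # Positional order is an assumption based on the commands quoted above.
    cmd = ["mfa", "align", corpus_dir, dictionary, acoustic_model, output_dir]
    if clean:
        cmd.append("--clean")
    if overwrite:
        cmd.append("--overwrite")
    if beam is not None:
        cmd += ["--beam", str(beam)]
    return cmd


# Placeholder paths; print the command to inspect it before running it.
print(shlex.join(build_align_command("corpus", "dict.txt", "model.zip", "out", beam=100)))
```

The returned list can be passed to subprocess.run once the printed command looks right.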

Nov 06 '23 17:11 halannhile
