Montreal-Forced-Aligner
[BUG] Fewer TextGrid files exported than wav files
Debugging checklist
[Yes] Have you updated to latest MFA version?
[Yes] Have you tried rerunning the command with the --clean flag?
Describe the issue Steps:
docker pull mmcauliffe/montreal-forced-aligner:v2.2.17
docker run -it -v /mydata:/data mmcauliffe/montreal-forced-aligner:v2.2.17
# try to train and align an Arabic corpus
mfa train --clean -j 50 --single_speaker /data/mgb2/segments/train_mer20/ /data/dict/arabic_mfa.dict /data/model/arabic_accoustic_model.zip --output_directory /data/mgb2/aligned/train_mer20/
Results: There was a Permission denied error when exporting:
INFO Exporting sat_4_ali TextGrids to /data/mgb2/aligned/train_mer20...
ERROR There was an error in the run, please see the log.
PermissionError: [Errno 13] Permission denied: '/data/mgb2/aligned/train_mer20'
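A pre-flight check along the following lines could surface this kind of error before a long training run. It is only a sketch: `can_write_output` is a hypothetical helper, not part of MFA, and it assumes that being able to create a file in the output directory (or in its nearest existing parent) is enough for the export step to succeed.

```python
import tempfile
from pathlib import Path


def can_write_output(output_dir):
    """Return True if a file can be created in output_dir or its nearest existing parent.

    Hypothetical helper, not an MFA function.
    """
    path = Path(output_dir).resolve()
    # Walk up until we reach a directory that already exists.
    while not path.exists():
        path = path.parent
    try:
        # Creating (and automatically deleting) a temp file is a direct write test.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        # Covers PermissionError as well as "not a directory".
        return False
```

Inside the container it would be called with the export path, for example `can_write_output("/data/mgb2/aligned/train_mer20")`, as the same user that runs `mfa`.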
So I granted write permission on the /data/ directory and reran with:
mfa train --no_clean -j 50 --single_speaker /data/mgb2/segments/train_mer20/ /data/dict/arabic_mfa.dict /data/model/arabic_accoustic_model.zip --output_directory /data/mgb2/aligned/train_mer20/
INFO Setting up corpus information...
INFO Found 1 speaker across 352416 files, average number of utterances per speaker: 352416.0
INFO Jobs already initialized.
INFO Text already normalized.
INFO Creating corpus split with features...
96% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━ 336,602/352,416 [ 0:00:10 < 0:00:01 , 51,389 it/s ]
INFO Features already generated.
INFO Filtering utterances for training...
INFO Pronunciation probability estimation already done, loading saved probabilities...
INFO Pronunciation probability estimation already done, loading saved probabilities...
INFO Accumulating transition stats...
76% ━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━ 268,022/352,416 [ 0:00:04 < 0:00:01 , 296,181 it/s ]
INFO Finished accumulating transition stats!
INFO Beginning phone LM training...
INFO Collecting training data...
78% ━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━ 275,887/352,416 [ 0:00:09 < 0:00:03 , 35,194 it/s ]
INFO Training model...
INFO Completed training in 48.3777596950531 seconds!
INFO Saved model to /data/model/arabic_accoustic_model.zip
WARNING Alignment analysis not available without using postgresql
INFO Exporting sat_4_ali TextGrids to /data/mgb2/aligned/train_mer20...
99% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 348,647/352,416 [ 0:01:00 < 0:00:01 , 11,376 it/s ]
INFO Finished exporting TextGrids to /data/mgb2/aligned/train_mer20!
INFO Done! Everything took 128.847 seconds
The run appears to succeed. However, when I counted the exported TextGrid files, there were only 279222, while the corpus contains 352416 wav files.
Could you tell me why the remaining TextGrids are missing? Thank you!
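To narrow this down, it may help to list exactly which wav files have no TextGrid (352416 - 279222 = 73194 of them here). The sketch below is one way to do that. `find_missing_textgrids` is a hypothetical helper, not part of MFA, and it assumes each exported TextGrid has the same relative path and file stem as its wav file; if the export lays files out differently, the comparison key would need adjusting.

```python
from pathlib import Path


def find_missing_textgrids(corpus_dir, output_dir):
    """Return relative paths (without extension) of wav files that have no TextGrid.

    Hypothetical helper; assumes TextGrids mirror the corpus layout and file stems.
    """
    corpus_dir, output_dir = Path(corpus_dir), Path(output_dir)
    # Key each file by its path relative to its root, minus the extension.
    wavs = {p.relative_to(corpus_dir).with_suffix("")
            for p in corpus_dir.rglob("*.wav")}
    grids = {p.relative_to(output_dir).with_suffix("")
             for p in output_dir.rglob("*.TextGrid")}
    # Anything present as a wav but absent as a TextGrid was not exported.
    return sorted(str(p) for p in wavs - grids)
```

Calling `find_missing_textgrids("/data/mgb2/segments/train_mer20", "/data/mgb2/aligned/train_mer20")` and inspecting the .lab files for a few of the returned names would show whether the skipped utterances have anything in common.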
For Reproducing your issue Please fill out the following:
- Corpus structure
- What language is the corpus in? Arabic
- How many files/speakers? 1 speaker, 352416 wav files
- Are you using lab files or TextGrid files for input? lab
- Dictionary
- Are you using a dictionary from MFA? If so, which one? yes, the arabic mfa dict
- If it's a custom dictionary, what is the phoneset?
- Acoustic model
- If you're using an acoustic model, is it one download through MFA? If so, which one?
- If it's a model you've trained, what data was it trained on?
Log file
Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).
2023-11-27 07:18:20,714 - mfa - DEBUG - Skipping pronunciation_probabilities_2 alignments
2023-11-27 07:18:21,702 - mfa - DEBUG - Skipping sat_4 alignments
2023-11-27 07:18:21,702 - mfa - INFO - Accumulating transition stats...
2023-11-27 07:18:34,979 - mfa - DEBUG - Accumulating transition stats took 13.275 seconds
2023-11-27 07:18:34,979 - mfa - INFO - Finished accumulating transition stats!
2023-11-27 07:18:34,989 - mfa - INFO - Beginning phone LM training...
2023-11-27 07:18:34,989 - mfa - INFO - Collecting training data...
2023-11-27 07:18:45,278 - mfa - INFO - Training model...
2023-11-27 07:18:46,904 - mfa - INFO - Completed training in 49.68739151954651 seconds!
2023-11-27 07:18:52,293 - mfa - INFO - Saved model to /data/model/arabic_accoustic_model.zip
2023-11-27 07:18:52,299 - mfa - DEBUG - Skipping sat_4 alignments
2023-11-27 07:18:52,300 - mfa - WARNING - Alignment analysis not available without using postgresql
2023-11-27 07:18:52,302 - mfa - INFO - Exporting sat_4_ali TextGrids to /data/mgb2/aligned/train_mer20...
2023-11-27 07:18:52,303 - mfa - ERROR - There was an error in the run, please see the log.
2023-11-27 07:19:57,022 - mfa - DEBUG - Beginning run for train_mer20
2023-11-27 07:19:57,022 - mfa - DEBUG - Using "global" profile
2023-11-27 07:19:57,022 - mfa - DEBUG - Using multiprocessing with 50
2023-11-27 07:19:57,022 - mfa - DEBUG - Set up logger for MFA version: 2.2.18.dev0+gf8d678f.d20230820
2023-11-27 07:19:57,056 - mfa - DEBUG - Using UNKNOWN
2023-11-27 07:19:57,245 - mfa - DEBUG - Loaded dictionary in 0.189 seconds
2023-11-27 07:19:57,248 - mfa - INFO - Setting up corpus information...
2023-11-27 07:19:57,252 - mfa - DEBUG - Successfully loaded from temporary files
2023-11-27 07:19:57,269 - mfa - INFO - Found 1 speaker across 352416 files, average number of utterances per speaker: 352416.0
2023-11-27 07:19:57,270 - mfa - DEBUG - Loaded corpus in 0.024 seconds
2023-11-27 07:19:57,272 - mfa - INFO - Jobs already initialized.
2023-11-27 07:19:57,273 - mfa - DEBUG - Initialized jobs in 0.003 seconds
2023-11-27 07:19:57,273 - mfa - INFO - Text already normalized.
2023-11-27 07:19:57,635 - mfa - DEBUG - Wrote lexicon information in 0.361 seconds
2023-11-27 07:19:57,637 - mfa - INFO - Creating corpus split with features...
2023-11-27 07:20:08,391 - mfa - DEBUG - Created corpus split directory in 10.756 seconds
2023-11-27 07:20:08,399 - mfa - INFO - Features already generated.
2023-11-27 07:20:08,400 - mfa - DEBUG - Generated features in 0.008 seconds
2023-11-27 07:20:08,400 - mfa - DEBUG - Setting up corpus took 11.344 seconds
2023-11-27 07:20:08,414 - mfa - INFO - Filtering utterances for training...
2023-11-27 07:20:11,668 - mfa - DEBUG - Skipping monophone alignments
2023-11-27 07:20:11,708 - mfa - DEBUG - Skipping triphone alignments
2023-11-27 07:20:11,749 - mfa - DEBUG - Skipping lda alignments
2023-11-27 07:20:11,808 - mfa - DEBUG - Skipping sat alignments
2023-11-27 07:20:11,949 - mfa - DEBUG - Skipping sat_2 alignments
2023-11-27 07:20:11,952 - mfa - INFO - Pronunciation probability estimation already done, loading saved probabilities...
2023-11-27 07:20:23,176 - mfa - DEBUG - Skipping pronunciation_probabilities alignments
2023-11-27 07:20:23,513 - mfa - DEBUG - Skipping sat_3 alignments
2023-11-27 07:20:23,516 - mfa - INFO - Pronunciation probability estimation already done, loading saved probabilities...
2023-11-27 07:20:34,475 - mfa - DEBUG - Skipping pronunciation_probabilities_2 alignments
2023-11-27 07:20:35,498 - mfa - DEBUG - Skipping sat_4 alignments
2023-11-27 07:20:35,498 - mfa - INFO - Accumulating transition stats...
2023-11-27 07:20:48,506 - mfa - DEBUG - Accumulating transition stats took 13.007 seconds
2023-11-27 07:20:48,506 - mfa - INFO - Finished accumulating transition stats!
2023-11-27 07:20:48,516 - mfa - INFO - Beginning phone LM training...
2023-11-27 07:20:48,517 - mfa - INFO - Collecting training data...
2023-11-27 07:20:58,353 - mfa - INFO - Training model...
2023-11-27 07:21:00,021 - mfa - INFO - Completed training in 48.3777596950531 seconds!
2023-11-27 07:21:05,473 - mfa - INFO - Saved model to /data/model/arabic_accoustic_model.zip
2023-11-27 07:21:05,482 - mfa - DEBUG - Skipping sat_4 alignments
2023-11-27 07:21:05,482 - mfa - WARNING - Alignment analysis not available without using postgresql
2023-11-27 07:21:05,486 - mfa - INFO - Exporting sat_4_ali TextGrids to /data/mgb2/aligned/train_mer20...
2023-11-27 07:22:05,859 - mfa - INFO - Finished exporting TextGrids to /data/mgb2/aligned/train_mer20!
2023-11-27 07:22:05,862 - mfa - DEBUG - Exported TextGrids in a total of 60.374 seconds
2023-11-27 07:22:05,865 - mfa - INFO - Done! Everything took 128.847 seconds
Desktop (please complete the following information):
- OS: [e.g. Windows, OSX, Linux] Linux
- Version [e.g. MacOSX 10.15, Ubuntu 20.04, Windows 10, etc]
- Any other details about the setup (Cloud, Docker, etc)
Additional context Add any other context about the problem here.
Same here. Got 14 TextGrids instead of 2176
same for me