
MFA can't create subdirectories if the input directory has subdirectories

Open MichaelGoodale opened this issue 5 years ago • 1 comments

Currently, if you pass an input folder whose subdirectories contain the wav and TextGrid files, MFA runs fine but fails to write the force-aligned TextGrids, because it cannot create the necessary subdirectories in the output directory.

Ex:

input_dir/
    speaker_1/
        file1.wav
        file1.textgrid

will fail with the following error:

The following exceptions were encountered during the ouput of the alignments to TextGrids:

file1:
Traceback (most recent call last):
  File "aligner/textgrid.py", line 122, in ctm_to_textgrid
  File "textgrid/textgrid.py", line 722, in write
  File "/usr/lib/python3.5/codecs.py", line 895, in open
FileNotFoundError: [Errno 2] No such file or directory: 'output/speaker_1/file1.TextGrid'

Note: MFA works fine if you create the subdirectories yourself in the output directory. I think all that needs to change is a quick call that recreates the input directory's subdirectories inside the output directory before the TextGrids are written.
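A minimal sketch of the kind of call suggested above. This is not MFA's actual code: `ensure_parent_dir` is a hypothetical helper name, and where it would be called inside MFA is an assumption. The idea is to create the missing parent directory right before each TextGrid is written:

```python
import os


def ensure_parent_dir(path):
    """Create the directory that will contain `path` if it is missing.

    Hypothetical helper, not part of MFA. Returns `path` unchanged so the
    call can wrap the path passed to whatever writes the file.
    """
    parent = os.path.dirname(path)
    if parent:  # empty string when `path` has no directory component
        # exist_ok=True makes the call safe when the directory already exists
        os.makedirs(parent, exist_ok=True)
    return path
```

For the example above, calling `ensure_parent_dir('output/speaker_1/file1.TextGrid')` before the write would create `output/speaker_1` if it does not exist yet.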

MichaelGoodale, Mar 06 '19 18:03

I only started experiencing this issue when I began using .TextGrid files for the transcripts. Before, when I was just using .txt files for the audio transcripts, I didn't experience this issue.

I observed the same behavior as described above where the FileNotFoundError will be raised if a matching sub-directory structure does not already exist in the output parent directory before the align command is run. If you manually create that sub-directory structure under the output parent, the align command completes as expected.

A potential workaround is to manually recreate the subdirectory structure in the MFA output directory before running the align command. This can be done with the commands below:

cd /path/to/src_directory
find ./ -type d -exec mkdir -p /path/to/output_directory/{} \;

The only things you need to change are the /path/to/src_directory and /path/to/output_directory values in the commands above. In the original example, /path/to/src_directory=input_dir and /path/to/output_directory=output.

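The find command searches the current directory (/path/to/src_directory) for directories (-type d) and, for each one it finds, executes (-exec) mkdir with the parents option (-p) on /path/to/output_directory/{}, where {} is replaced by the path of that directory. The result is a copy of the source directory tree, without any files, under the output directory.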
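If find is not available (for example on Windows), the same workaround can be sketched in Python. This is only an illustration of the commands above; `mirror_directory_tree` is a hypothetical name, and the sketch assumes the output directory is not located inside the source directory:

```python
import os


def mirror_directory_tree(src_dir, out_dir):
    """Recreate every subdirectory of src_dir (directories only) under out_dir.

    Hypothetical helper that mimics
    `find ./ -type d -exec mkdir -p out_dir/{} \\;` run from src_dir.
    Returns the list of directories that were ensured to exist.
    """
    ensured = []
    for root, _dirs, _files in os.walk(src_dir):
        # Path of the current directory relative to the source root
        rel = os.path.relpath(root, src_dir)
        target = os.path.normpath(os.path.join(out_dir, rel))
        os.makedirs(target, exist_ok=True)
        ensured.append(target)
    return ensured
```

With the original example, `mirror_directory_tree('input_dir', 'output')` would create `output/speaker_1` before the align command is run.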

dzubke, Feb 23 '21 16:02