Montreal-Forced-Aligner
File "aligner\corpus.py", line 500, in speaker_utterance_info ZeroDivisionError: division by zero
Hi everyone,
My corpus is a set of .textgrid and .wav files. The sampling rates are correct (44k), but when I run this on Windows 10:
F:\montreal-forced-aligner>bin\mfa_align F:\hiwi\12november\test F:\montreal-forced-aligner\dictionaries\german.txt german_prosodylab F:\hiwi\12november\result
I get an exception: ZeroDivisionError.
How can this happen?
Did you solve this problem?
I had the same problem, but I have solved it now. Please check your file names; for example, replace 'xxx.wav xxx' with 'xxx xxx'.
I think this comes about because MFA didn't find any files to process (thus zero speakers). In my case it was caused by having .wav files with floating point data, but I believe anything that causes MFA to not find the audio and text data will cause this.
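To make the failure mode concrete, here is a minimal sketch of how an empty speaker mapping leads to a division by zero, and how a guard turns it into a clearer message. This is an illustration only, not MFA's actual code: the function name `speaker_utterance_summary` and the plain dict argument are assumptions made for the example.

```python
def speaker_utterance_summary(speak_utt_mapping):
    """Return (num_speakers, average_utterances_per_speaker).

    `speak_utt_mapping` is a hypothetical dict mapping a speaker name to a
    list of utterance ids. An empty dict means no audio/transcript pairs
    were found, so we raise a descriptive error instead of dividing by zero.
    """
    num_speakers = len(speak_utt_mapping)
    if num_speakers == 0:
        raise ValueError(
            "No speakers found: check that the corpus directory contains "
            "readable .wav files with matching transcripts."
        )
    total_utterances = sum(len(utts) for utts in speak_utt_mapping.values())
    return num_speakers, total_utterances / num_speakers
```

With an empty mapping, the unguarded version of this computation is exactly `0 / 0`, which is what Python reports as `ZeroDivisionError: division by zero`.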
Yes, at the moment there's some audio preprocessing/inspection that's done in Python, which doesn't support as many formats as I would like, so I'd like to move over to sox for this kind of thing. In the meantime you might be able to resample/resave using sox or Praat into a WAV format known to be supported (i.e., 16kHz, 16-bit).
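Before resampling, it can help to see what format each file is actually in. The following is a rough sketch using only Python's standard `wave` module; the helper names `describe_wav` and `is_16khz_16bit_mono` are made up for this example, and treating mono as part of the target format is an assumption rather than a documented MFA requirement. Note that `wave` only reads PCM data, so floating-point WAV files raise `wave.Error` here, which the second helper reports as "not in the expected format".

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_in_bytes, frame_rate) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getnchannels(),
            wav_file.getsampwidth(),
            wav_file.getframerate(),
        )


def is_16khz_16bit_mono(path):
    """True if the file is readable PCM, mono, 16-bit (2 bytes), 16 kHz."""
    try:
        return describe_wav(path) == (1, 2, 16000)
    except (wave.Error, EOFError):
        # Non-PCM (e.g. floating point), truncated, or non-WAV content.
        return False
```

Any file for which `is_16khz_16bit_mono` returns False is a candidate for resaving with sox or Praat.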
Thank you all, that was very helpful! I resampled the files with sox, as Michael (@mmcauliffe) suggested, and the problem is gone.
Hey all, I'm stuck! Here is the code I used to resample the audio to 1 channel and 16kHz.
import os
import wave
import audioop

def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print('Source not found!')
        return False
    if not os.path.exists(os.path.dirname(dst)):
        os.makedirs(os.path.dirname(dst))
    try:
        s_read = wave.open(src, 'r')
        s_write = wave.open(dst, 'w')
    except:
        print('Failed to open files!')
        return False
    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)
    try:
        converted = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        if outchannels == 1:
            converted = audioop.tomono(converted[0], 2, 1, 0)
    except:
        print('Failed to downsample wav')
        return False
    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except:
        print('Failed to write wav')
        return False
    try:
        s_read.close()
        s_write.close()
    except:
        print('Failed to close wav files')
        return False
    return True

in_path = 'C:/Users/McKinstryJohn/Desktop/.../Resampled/wavs/'
out_path = 'C:/Users/McKinstryJohn/Desktop/.../Resampled/resampled/'
sr = 0
for file in os.listdir(in_path):
    if file.endswith('.wav'):
        with wave.open(os.path.join(in_path + file), 'rb') as wav_file:
            sr = wav_file.getframerate()
        downsampleWav(os.path.join(in_path + file), os.path.join(out_path + file),
                      inrate=sr, outrate=16000)
However, when I run (on Windows 10)...
mfa align ./resampled ./librispeech-lexicon.txt english ./output
...I get the ZeroDivisionError. This is on new audio files, so what could be the issue here?
Hard to say without seeing some logs. Maybe try running
mfa validate ./resampled ./librispeech-lexicon.txt
and see what the output says about the corpus (also see: https://montreal-forced-aligner.readthedocs.io/en/latest/data_validation.html). If you think it's due to resampling, you could try using sox instead to generate the correct format; that's usually my go-to for these kinds of issues.
Thanks for the quick reply, however, I'm still getting the same error after processing the audio with sox. To give you more info, this is the MFA command:
mfa align ./resampled ./librispeech-lexicon.txt english ./output
This is the error log:
Setting up corpus information...
WARNING: Some issues parsing the corpus were detected. Please run the validator to get more information.
Traceback (most recent call last):
  File "c:\...\aligner\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\...\aligner\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\...\aligner\Scripts\mfa.exe\__main__.py", line 7, in <module>
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\command_line\mfa.py", line 290, in main
    run_align_corpus(args, acoustic_languages)
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 147, in run_align_corpus
    align_corpus(args)
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 71, in align_corpus
    print(corpus.speaker_utterance_info())
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\corpus\base.py", line 307, in speaker_utterance_info
    average_utterances = sum(len(x) for x in self.speak_utt_mapping.values()) / num_speakers
ZeroDivisionError: division by zero
Lastly, the audio files were preprocessed like so:
import glob
import subprocess
in_path = 'C:/.../wavs/'
out_path = 'C:/.../resampled/'
sr = 16000
for file in glob.glob(in_path + '*.wav'):
    output_file = out_path + file[len(in_path):]
    subprocess.call(f"sox \"{file}\" -c 1 -r {sr} \"{output_file}\"", shell=True)
Sox did give a warning with almost every file, which looked like this:
sox WARN rate: rate clipped 8 samples; decrease volume?
sox WARN dither: dither clipped 7 samples; decrease volume?
I'm currently looking into that part, but any advice would be greatly appreciated! I did try the validate command, but its message isn't very informative. On a closer look, mfa isn't populating the 'self.utt_wav_files' variable, which gives the following error:
CorpusError('There were no wav files found for transcribing this corpus. Please validate the corpus.')
montreal_forced_aligner.exceptions.CorpusError: There were no wav files found for transcribing this corpus. Please validate the corpus.
Does the resampled directory contain the transcript files as well? If there are no .lab/.txt/.TextGrid files in the same directory with the same names as the .wav files, it can't perform alignment on them.
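A quick way to check this is to list every .wav file that has no transcript with the same base name next to it. This is only a sketch: the function name `wavs_missing_transcripts` is made up for the example, the extension list is taken from the reply above, and it looks at a single flat directory only (no subdirectories).

```python
from pathlib import Path

# Transcript extensions mentioned above, compared case-insensitively.
TRANSCRIPT_EXTENSIONS = {".lab", ".txt", ".textgrid"}


def wavs_missing_transcripts(corpus_dir):
    """Return sorted names of .wav files with no same-named transcript file."""
    files = [p for p in Path(corpus_dir).iterdir() if p.is_file()]
    transcript_stems = {
        p.stem for p in files if p.suffix.lower() in TRANSCRIPT_EXTENSIONS
    }
    return sorted(
        p.name
        for p in files
        if p.suffix.lower() == ".wav" and p.stem not in transcript_stems
    )
```

If this returns every .wav file in the directory, the aligner has nothing to pair the audio with, which matches the "no wav files found" symptom described here.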
Thanks for your help! What eventually worked for me was resampling the audio to 1 channel and 16kHz using SoX and then creating .lab files using prosodylab.alignertools. Once I had the .lab files and there were no spaces in the file names, 'mfa train' worked.
Oh, some of the file names had spaces? That might have caused issues, since Kaldi uses spaces as a delimiter in its internal files. I'll do some testing with that and see if there's an easy fix to support those.
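Until that is resolved, one workaround is to rename the files so that no name contains a space. Below is a minimal sketch under a few assumptions: the helper names `plan_space_free_renames` and `apply_renames` are made up, underscores are an arbitrary choice of replacement, and only a single flat directory is handled. Audio and transcript files that share a base name are renamed the same way, so the pairs stay matched.

```python
from pathlib import Path


def plan_space_free_renames(corpus_dir, replacement="_"):
    """Return a list of (old_path, new_path) for files whose names contain spaces."""
    renames = []
    for path in sorted(Path(corpus_dir).iterdir()):
        if path.is_file() and " " in path.name:
            new_name = path.name.replace(" ", replacement)
            renames.append((path, path.with_name(new_name)))
    return renames


def apply_renames(renames):
    """Perform the planned renames, refusing to overwrite an existing file."""
    for old_path, new_path in renames:
        if new_path.exists():
            raise FileExistsError(f"Refusing to overwrite {new_path}")
        old_path.rename(new_path)
```

Printing the result of `plan_space_free_renames` first makes it possible to review the changes before `apply_renames` touches anything.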
I've just wasted half an hour on this problem. The mfa train command kept throwing CorpusError('There were no wav files found for transcribing this corpus. Please validate the corpus.') again and again.
I couldn't figure out the real cause, but I probably gave the train command the wrong corpus directory the first time. After that, it kept throwing this exception until I deleted the acoustic model directory, which in my case is ~/Documents/MFA/${model_name}
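For anyone who wants to script that cleanup, here is a small sketch. It assumes the directory layout described above (~/Documents/MFA/<model_name>), which may differ on other setups; the function name `remove_stale_model_dir` and the `mfa_root` parameter are made up for this example. It deletes the directory recursively, so check the path before running it.

```python
import shutil
from pathlib import Path


def remove_stale_model_dir(model_name, mfa_root=None):
    """Delete <mfa_root>/<model_name> if it exists; return True if removed.

    `mfa_root` defaults to ~/Documents/MFA, the location reported in this
    thread (an assumption, not a guaranteed default on every install).
    """
    # Only accept a plain directory name, so a typo cannot delete a parent folder.
    if not model_name or model_name == ".." or Path(model_name).name != model_name:
        raise ValueError("model_name must be a single directory name")
    root = Path(mfa_root) if mfa_root is not None else Path.home() / "Documents" / "MFA"
    target = root / model_name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

The next training run then starts from a clean directory instead of reusing whatever was left behind by the earlier attempt.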