File "aligner\corpus.py", line 500, in speaker_utterance_info ZeroDivisionError: division by zero

Open RealNicolasBourbaki opened this issue 6 years ago • 12 comments

Hi everyone,

My corpus is a set of .textgrid and .wav files. The sampling rates are correct (44k), but I get an exception when I run this on Windows 10:

F:\montreal-forced-aligner>bin\mfa_align F:\hiwi\12november\test F:\montreal-forced-aligner\dictionaries\german.txt german_prosodylab F:\hiwi\12november\result

The exception is a ZeroDivisionError.

How can that happen?

RealNicolasBourbaki avatar Dec 23 '18 19:12 RealNicolasBourbaki

Did you solve this problem?

forwiat avatar Jan 23 '19 05:01 forwiat

I had the same problem, but I have now solved it. Please check your file names; for example, replace 'xxx.wav xxx' with 'xxx xxx'.

forwiat avatar Jan 23 '19 06:01 forwiat

I think this comes about because MFA didn't find any files to process (thus zero speakers). In my case it was caused by having .wav files with floating point data, but I believe anything that causes MFA to not find the audio and text data will cause this.
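A minimal sketch of that failure mode, assuming the speaker statistic is a plain average over a speaker-to-utterances mapping. The helper name `average_utterances_per_speaker` is hypothetical and is not MFA's actual code; it only illustrates how an empty corpus leads to a division by zero and how a guard avoids it.

```python
def average_utterances_per_speaker(speak_utt_mapping):
    """Return the mean number of utterances per speaker, or 0.0 for an empty corpus."""
    num_speakers = len(speak_utt_mapping)
    if num_speakers == 0:
        # An empty mapping means no audio/transcript pairs were loaded,
        # so dividing by num_speakers would raise ZeroDivisionError.
        return 0.0
    total = sum(len(utterances) for utterances in speak_utt_mapping.values())
    return total / num_speakers


print(average_utterances_per_speaker({}))  # 0.0
print(average_utterances_per_speaker({"spk1": ["a", "b"], "spk2": ["c"]}))  # 1.5
```

Without the guard, the first call would fail in the same way as the reported traceback.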

MalcolmSlaney avatar Feb 18 '19 09:02 MalcolmSlaney

Yes, at the moment there's some audio preprocessing/inspection that's done in Python, which doesn't support as many formats as I would like, so I'd like to move over to sox for this kind of thing. In the meantime you might be able to resample/resave using sox or Praat into a WAV format known to be supported (i.e., 16kHz, 16-bit).
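Before resampling, it may help to see which files are not already in that format. The sketch below uses only Python's standard `wave` module; the function names `describe_wav` and `is_16khz_16bit` are hypothetical helpers for illustration, not part of MFA. The `wave` module only reads PCM data, so a floating-point WAV is reported here as unreadable.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, sample_rate), or None if not readable as PCM."""
    try:
        with wave.open(path, "rb") as wav_file:
            return (
                wav_file.getnchannels(),
                wav_file.getsampwidth(),
                wav_file.getframerate(),
            )
    except (wave.Error, EOFError):
        # wave rejects non-PCM data (for example floating-point WAVs)
        # and files that are not valid RIFF/WAVE at all.
        return None


def is_16khz_16bit(path):
    """True if the file is PCM with 2-byte samples at 16000 Hz."""
    info = describe_wav(path)
    return info is not None and info[1] == 2 and info[2] == 16000
```

Files for which `is_16khz_16bit` returns False are candidates for resaving with sox or Praat.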

mmcauliffe avatar Feb 22 '19 19:02 mmcauliffe

Thank you all, that was very helpful! I resampled the files with sox as Michael (@mmcauliffe) suggested, and the problem is gone.

RealNicolasBourbaki avatar Feb 24 '19 20:02 RealNicolasBourbaki

Hey all, I'm stuck! Here is the code I used to resample the audio to 1 channel and 16kHz.

import os
import wave

import audioop

def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print('Source not found!')
        return False

    if not os.path.exists(os.path.dirname(dst)):
        os.makedirs(os.path.dirname(dst))

    try:
        s_read = wave.open(src, 'r')
        s_write = wave.open(dst, 'w')
    except:
        print('Failed to open files!')
        return False

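    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)
    # Trust the file header over the inchannels argument, which defaults to stereo.
    inchannels = s_read.getnchannels()

    try:
        # ratecv returns (fragment, state); keep only the resampled fragment.
        converted, _ = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        if inchannels == 2 and outchannels == 1:
            # Weights (1, 0) keep the left channel and drop the right.
            converted = audioop.tomono(converted, 2, 1, 0)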
    except:
        print('Failed to downsample wav')
        return False

    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except:
        print('Failed to write wav')
        return False

    try:
        s_read.close()
        s_write.close()
    except:
        print('Failed to close wav files')
        return False

    return True


in_path = 'C:/Users/McKinstryJohn/Desktop/.../Resampled/wavs/'
out_path = 'C:/Users/McKinstryJohn/Desktop/.../Resampled/resampled/'

sr = 0

for file in os.listdir(in_path):
    if file.endswith('.wav'):
        # Read the source sample rate from the WAV header.
        with wave.open(os.path.join(in_path, file), 'rb') as wav_file:
            sr = wav_file.getframerate()

        downsampleWav(os.path.join(in_path, file), os.path.join(out_path, file),
                      inrate=sr, outrate=16000)

However, when I run (on Windows 10)...

mfa align ./resampled ./librispeech-lexicon.txt english ./output

...I get the ZeroDivisionError. This is on new audio files, so what could be the issue here?

MckinstryJ avatar Jan 09 '21 04:01 MckinstryJ

Hard to say without seeing some logs. Maybe try doing mfa validate ./resampled ./librispeech-lexicon.txt and see what the output says about the corpus (also see: https://montreal-forced-aligner.readthedocs.io/en/latest/data_validation.html). If you think it's due to resampling, you could try using sox instead to generate the correct format, that's usually my go-to for these kinds of issues.

mmcauliffe avatar Jan 09 '21 22:01 mmcauliffe

Thanks for the quick reply. However, I'm still getting the same error after processing the audio with sox. To give you more info, this is the MFA command:

mfa align ./resampled ./librispeech-lexicon.txt english ./output

This is the error log:

Setting up corpus information...
WARNING: Some issues parsing the corpus were detected. Please run the validator to get more information.
Traceback (most recent call last):
  File "c:\...\aligner\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\...\aligner\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\...\aligner\Scripts\mfa.exe\__main__.py", line 7, in <module>
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\command_line\mfa.py", line 290, in main
    run_align_corpus(args, acoustic_languages)
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 147, in run_align_corpus
    align_corpus(args)
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 71, in align_corpus
    print(corpus.speaker_utterance_info())
  File "c:\...\aligner\lib\site-packages\montreal_forced_aligner\corpus\base.py", line 307, in speaker_utterance_info
    average_utterances = sum(len(x) for x in self.speak_utt_mapping.values()) / num_speakers
ZeroDivisionError: division by zero 

Lastly, the audio files were preprocessed like so:

import glob
import subprocess

in_path = 'C:/.../wavs/'
out_path = 'C:/.../resampled/'

sr = 16000

for file in glob.glob(in_path + '*.wav'):
    output_file = out_path + file[len(in_path):]
    subprocess.call(f"sox \"{file}\" -c 1 -r {sr} \"{output_file}\"", shell=True)

Sox did give a warning with almost every file, which looked like this:

sox WARN rate: rate clipped 8 samples; decrease volume?
sox WARN dither: dither clipped 7 samples; decrease volume?

I'm currently looking into that part, but any advice would be greatly appreciated! I did try the validate command, but its message isn't very informative. On a deeper look, mfa isn't populating the 'self.utt_wav_files' variable, which gives the following error:

CorpusError('There were no wav files found for transcribing this corpus. Please validate the corpus.')
montreal_forced_aligner.exceptions.CorpusError: There were no wav files found for transcribing this corpus. Please validate the corpus.

MckinstryJ avatar Jan 13 '21 00:01 MckinstryJ

Does the resampled directory contain the transcript files as well? If there's no .lab/.txt/.TextGrid files in the same directory with same name .wav files, it can't perform alignment on them.
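One way to check this before running the aligner is to list the .wav files that have no transcript with the same base name. This is a minimal sketch, assuming a flat corpus directory and case-insensitive extension matching; the function name `wavs_without_transcripts` is hypothetical and not part of MFA.

```python
from pathlib import Path

# Transcript extensions named above, compared in lower case.
TRANSCRIPT_EXTENSIONS = {".lab", ".txt", ".textgrid"}


def wavs_without_transcripts(corpus_dir):
    """Return the names of .wav files that lack a same-named transcript file."""
    files = [p for p in Path(corpus_dir).iterdir() if p.is_file()]
    transcribed = {
        p.stem for p in files if p.suffix.lower() in TRANSCRIPT_EXTENSIONS
    }
    return sorted(
        p.name
        for p in files
        if p.suffix.lower() == ".wav" and p.stem not in transcribed
    )
```

An empty result means every audio file has a candidate transcript next to it; it does not check that the transcripts themselves are valid.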

mmcauliffe avatar Jan 13 '21 00:01 mmcauliffe

Thanks for your help! What eventually worked for me was resampling the audio to 1 channel and 16kHz using SoX, then creating .lab files using prosodylab.alignertools. Once I had the .lab files and there were no spaces in the file names, 'mfa train' worked.

MckinstryJ avatar Jan 14 '21 01:01 MckinstryJ

Oh, some of the file names had spaces? That might have caused issues, since Kaldi uses spaces as a delimiter in its internal files. I'll do some testing with that and see if there's an easy fix to support those.
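Until spaces are supported, a workaround is to rename the affected files. The sketch below is one possible approach, assuming a flat corpus directory; the function name `replace_spaces_in_names` and its `dry_run` parameter are hypothetical. Audio and transcript files that share a base name are renamed the same way, so the pairs stay matched.

```python
from pathlib import Path


def replace_spaces_in_names(corpus_dir, replacement="_", dry_run=True):
    """Rename files whose names contain spaces; return (old_name, new_name) pairs."""
    renames = []
    for path in sorted(Path(corpus_dir).iterdir()):
        if path.is_file() and " " in path.name:
            target = path.with_name(path.name.replace(" ", replacement))
            if target.exists():
                # Refuse to overwrite an existing file.
                raise FileExistsError(f"{target} already exists")
            renames.append((path.name, target.name))
            if not dry_run:
                path.rename(target)
    return renames
```

Calling it with the default `dry_run=True` only reports what would change, which makes it easy to review the list before renaming anything.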

mmcauliffe avatar Jan 14 '21 17:01 mmcauliffe

I've just lost half an hour to this problem. The mfa train command kept throwing CorpusError('There were no wav files found for transcribing this corpus. Please validate the corpus.'). I couldn't pin down the root cause, but I probably gave the train command the wrong corpus directory the first time. After that, it kept throwing the same exception until I deleted the acoustic model directory, which in my case is ~/Documents/MFA/${model_name}

asarsembayev avatar Feb 23 '24 06:02 asarsembayev