Speed up DSAlign

Open galv opened this issue 4 years ago • 0 comments

Right now, we time out when an audio file fails to align with its transcript within 200 seconds: https://github.com/mlcommons/peoples-speech/pull/27/files#diff-b790cd27585332e1eeca7dab897f1ccd7bcd483181132bd9914f2dd07062534fR401

With this limit, 10% of our files time out during alignment.
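For readers who have not opened the linked diff, here is a minimal sketch of a per-file timeout. It assumes alignment is launched as a subprocess, which may not match what the PR actually does. `run_with_timeout` and the commands passed to it are hypothetical stand-ins, not the project's code.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=200.0):
    """Run cmd and report whether it finished successfully within timeout_s.

    Hypothetical helper: the real pipeline may enforce its limit differently.
    """
    try:
        # subprocess.run kills the child process if the timeout expires.
        completed = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # Stand-in for an alignment command; this one exits immediately.
    print(run_with_timeout([sys.executable, "-c", "pass"], timeout_s=30.0))
```

Counting the `False` results over a batch of files is one way to track the timeout rate while experimenting with speedups.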

One observation is that DSAlign seems to slow to a crawl when the ground-truth transcript does not match what was actually said in the audio (e.g., the transcript is a translation).

One option is to reimplement some part of DSAlign in Cython. But we should really dive deep into what's going on, and see if there's something better we can do.
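As a possible first step before rewriting anything in Cython, here is a sketch of profiling a single slow call with the standard-library `cProfile`. `profile_call` and `slow_align_stand_in` are hypothetical names; the stand-in would need to be replaced with whatever function actually drives DSAlign on one file.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10, **kwargs):
    """Run fn under cProfile; return its result and a text report.

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


def slow_align_stand_in(n):
    # Hypothetical placeholder for aligning one audio file with its transcript.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(slow_align_stand_in, 100_000)
    print(report)
```

Running this on a file whose transcript is known to mismatch the audio, and comparing against a well-matched file, should show which functions account for the extra time.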

galv • Jun 23 '21 06:06