
AutoVOT support

mmcauliffe opened this issue 5 years ago · 2 comments

@MichaelGoodale Here's a high-level overview of how to implement AutoVOT in conch in such a way that it can be used from PolyglotDB (since I apparently never got around to writing documentation for this package...).

From PolyglotDB, a call is made to conch.main.analyze_segments (https://github.com/mmcauliffe/Conch-sounds/blob/master/conch/main.py#L191), which takes a segment mapping object and an analysis function.

The SegmentMapping object (https://github.com/mmcauliffe/Conch-sounds/blob/master/conch/analysis/segments.py#L107) primarily contains a list of segments (each specified by sound file, begin, end, and channel), with some additional functionality like the ability to group segments by extra metadata, e.g., speaker.
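For concreteness, a minimal sketch of building a mapping (the method names are taken from segments.py, and the `speaker` keyword is just illustrative extra metadata, so treat the details as approximate):

```python
# Minimal sketch of building a SegmentMapping; method names follow
# segments.py, and the `speaker` keyword is illustrative extra metadata.
from conch.analysis.segments import SegmentMapping

mapping = SegmentMapping()
# Each segment is specified by sound file, begin, end, channel.
mapping.add_file_segment('/data/sp1_utt1.wav', begin=1.25, end=1.40,
                         channel=0, speaker='sp1')
mapping.add_file_segment('/data/sp2_utt1.wav', begin=0.80, end=0.95,
                         channel=0, speaker='sp2')

# Group segments by extra metadata, e.g. to process one speaker at a time.
by_speaker = mapping.grouped_mapping('speaker')
```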

The analysis function is a callable object descended from BaseAnalysisFunction (https://github.com/mmcauliffe/Conch-sounds/blob/master/conch/analysis/functions.py#L14). The one most similar to what you'll be implementing for AutoVOT is likely the PraatAnalysisFunction (https://github.com/mmcauliffe/Conch-sounds/blob/master/conch/analysis/praat.py#L6), which wraps the PraatAnalysisFunction from the Pyraat package (https://github.com/mmcauliffe/Pyraat/blob/master/pyraat/praat_script.py#L72). That function takes a path to the Praat executable and a path to a Praat script, then parses the script's output (and includes some inspection of whether the Praat script has the correct input arguments and output format).
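Tying the pieces together, reusing `mapping` from the previous snippet (the PraatAnalysisFunction constructor arguments here are my assumption based on the description above, a Praat executable plus a script, not a checked signature):

```python
# Hedged sketch of the full call; the PraatAnalysisFunction arguments are
# assumed from the description above, not a checked signature.
from conch.main import analyze_segments
from conch.analysis.praat import PraatAnalysisFunction

func = PraatAnalysisFunction('/path/to/measure_formants.praat',
                             praat_path='/usr/bin/praat')
results = analyze_segments(mapping, func)  # one result per segment
```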

So, depending on whether AutoVOT has parallel processing built in (@msonderegger?), you may be able to skip the multiprocessing part of analyze_segments and instead pass the segments directly to AutoVOT (assuming it supports segmenting larger audio files somehow, maybe via TextGrids?). A first-pass implementation can just do one sound file at a time and let conch handle the parallelism, with optimizations later to let AutoVOT handle it somehow.
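A skeleton of that first pass might look something like this; apart from the BaseAnalysisFunction name, everything here (the constructor, the __call__ signature, the helper) is hypothetical scaffolding for the TextGrid-writing and decoding steps discussed further down:

```python
# Hypothetical first-pass skeleton: one segment at a time, with conch's
# analyze_segments handling the multiprocessing. Only BaseAnalysisFunction
# comes from conch; the rest is placeholder scaffolding.
from conch.analysis.functions import BaseAnalysisFunction


def run_autovot_on_segment(segment, classifier_path):
    # Placeholder: write the TextGrid/wav pair AutoVOT expects for this
    # segment, run VOT decoding, and parse the prediction back out.
    raise NotImplementedError


class AutoVOTAnalysisFunction(BaseAnalysisFunction):
    def __init__(self, classifier_path):
        self.classifier_path = classifier_path

    def __call__(self, segment):
        # segment carries sound file, begin, end, channel (see above).
        return run_autovot_on_segment(segment, self.classifier_path)
```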

mmcauliffe commented Aug 13 '18 20:08

@MichaelGoodale The AutoVOT use case to go with, from the code/tutorial at https://github.com/mlml/autovot, is "VOT decoding mode 1": use an existing classifier to measure VOT for stops in a set of TextGrids and corresponding wav files.

Good news: the AutoVOT interface is now 100% Python -- I had completely forgotten this. That should make it easier to integrate its functionality into conch.

Bad news: AutoVOT assumes TextGrids of a particular form (each with an associated wav file), which I suppose conch will have to create. I think there are examples in the tutorial, or I have lots if you need them.
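For generating those TextGrids from Python, something like the third-party textgrid package could work; the tier name ('vot') and the one-interval-per-stop layout below are my guesses at the expected format, so check the tutorial examples for the real conventions:

```python
# Hedged sketch of writing a TextGrid/wav pair for AutoVOT using the
# third-party `textgrid` package. The tier name ('vot') and the
# one-interval-per-stop-window layout are guesses at the expected format.
import textgrid

tg = textgrid.TextGrid(minTime=0.0, maxTime=2.5)
tier = textgrid.IntervalTier(name='vot', minTime=0.0, maxTime=2.5)
tier.add(1.20, 1.45, 'p')  # a window around each stop flagged for VOT
tg.append(tier)
tg.write('/data/sp1_utt1.TextGrid')  # pairs with /data/sp1_utt1.wav
```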

As @mmcauliffe says, it's probably best as a first pass to do one sound file at a time -- this will mean one TextGrid/wav file pair (probably with more than one stop flagged for VOT annotation). AutoVOT by default writes out a TextGrid with VOT predictions on a new tier, but I think you'll want to use the --csv_file flag to speed things up.
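Concretely, the invocation could look roughly like this; the positional list-file arguments reflect my reading of the tutorial and should be double-checked, with --csv_file being the flag mentioned above:

```python
# Rough sketch of driving "VOT decoding mode 1" by shelling out to
# auto_vot_decode.py from the AutoVOT repo. The positional list-file
# arguments are my reading of the tutorial and should be double-checked.
import subprocess

subprocess.check_call([
    'auto_vot_decode.py',
    'wav_list.txt',       # one wav path per line
    'textgrid_list.txt',  # matching TextGrid path on each line
    'classifier.model',   # existing pre-trained classifier
    '--csv_file', 'vot_predictions.csv',  # csv output (faster than TextGrids)
])
```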

As data for development, you could use the example data included in the AutoVOT tutorial.

msonderegger commented Aug 15 '18 17:08

@MichaelGoodale could you post an update on progress here and ask any questions you have for Michael M.? He is on vacation from tomorrow until 9/25.

msonderegger commented Sep 12 '18 13:09