basic format discussion (rttm)

Open alecristia opened this issue 7 years ago • 1 comments

Pros of rttm

Cons of rttm in divime

our current implementation is poor (but this should be fixed anyway)
we could use the convention, but for the transcript part we'd have all time stamps marked with an asterisk
not easy to read for target users (but this could be fixed by generating other formats)
more detailed than needed for several tasks (e.g. speech detection)

We discussed alternative formats:

stm (NIST, for transcriptions): file, channel, speaker, beg, (dur/end), category [male, far... properties], transcription
ctm (NIST, for phones)
WCE: own format
NOTE: including more formats means more complexity
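
To make the stm layout listed above concrete, here is a minimal parsing sketch. It assumes the fifth field is an end time (as in the SCTK description), that the optional category appears in angle brackets before the transcription, and that lines starting with `;;` are comments. The names `StmSegment` and `parse_stm_line` are hypothetical and not part of any existing tool.

```python
from typing import NamedTuple, Optional


class StmSegment(NamedTuple):
    """One stm record (hypothetical container, for illustration only)."""
    file: str
    channel: str
    speaker: str
    begin: float
    end: float
    label: Optional[str]   # category such as "o,f0,female", if present
    transcript: str


def parse_stm_line(line: str) -> Optional[StmSegment]:
    """Parse a single stm line; return None for blank or comment lines."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    # The first five fields are whitespace-separated; the rest is free text.
    parts = line.split(None, 5)
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(parts)}")
    file, channel, speaker, begin, end = parts[:5]
    rest = parts[5] if len(parts) > 5 else ""
    label = None
    # Assumption: the optional category is wrapped in <...> before the text.
    if rest.startswith("<") and ">" in rest:
        close = rest.index(">")
        label = rest[1:close]
        rest = rest[close + 1:].strip()
    return StmSegment(file, channel, speaker, float(begin), float(end), label, rest)
```

For example, `parse_stm_line("rec01 A CHI 1.25 2.75 <o,f0,female> more milk please")` yields a segment with speaker `CHI`, label `o,f0,female`, and the transcription as the final field. The file and speaker names here are made up.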

CONCLUSION:

[ ] fix our use of rttm and make it the standard for SAD/VAD, talker and role diarization, and VCM, so that all of them use the same eval scripts
[ ] for WCE, as well as input for these, we'll use stm: http://www1.icsi.berkeley.edu/Speech/docs/sctk-1.2/infmts.htm#stm_fmt_name_0
we are still waiting to see how we will evaluate WCE
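
As a rough sketch of what a standard rttm line could look like for the tools listed in the conclusion, the helper below writes one `SPEAKER` record with the ten fields commonly used in diarization scoring (type, file, channel, onset, duration, orthography, speaker type, speaker name, confidence, lookahead). The function name `format_rttm_line`, the three-decimal precision, and the default channel of 1 are assumptions for illustration, not the current divime behaviour.

```python
def format_rttm_line(file_id: str, onset: float, duration: float,
                     label: str, channel: int = 1) -> str:
    """Build one rttm SPEAKER record.

    Unused fields are filled with <NA>. The label goes in the
    speaker-name column (8th field), whatever the tool producing it.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    return (f"SPEAKER {file_id} {channel} {onset:.3f} {duration:.3f} "
            f"<NA> <NA> {label} <NA> <NA>")
```

A SAD/VAD tool could call `format_rttm_line("rec01", 1.25, 1.5, "speech")`, while a talker diarization tool would pass a talker label instead, so both outputs share one layout. The file name and labels are made up.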

Note: Check Coconut for conversion across formats

Nov 29 '18 15:11 alecristia

After further discussion, we decided that VCM will also write its output to the "speaker ID" column, and thus use the eval from evalDiar.
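
One consequence of this decision is that a single reader can serve both diarization and VCM output, since the label always sits in the same column. A minimal sketch, assuming the usual ten-field rttm layout with the speaker ID in the 8th field; `read_rttm_labels` and the labels in the usage note are hypothetical, and this is not the evalDiar code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def read_rttm_labels(lines: Iterable[str]) -> Dict[str, List[Tuple[float, float]]]:
    """Group (onset, offset) pairs by the speaker-ID column of rttm lines."""
    by_label = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines and any record type other than SPEAKER.
        if not fields or fields[0] != "SPEAKER":
            continue
        onset = float(fields[3])
        duration = float(fields[4])
        by_label[fields[7]].append((onset, onset + duration))
    return dict(by_label)
```

Given a line whose 8th field is a talker label such as `CHI` and another whose 8th field is a vocalization class, the function returns one list of intervals per label without needing to know which tool wrote the file.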

Nov 29 '18 19:11 alecristia