AMICorpusXML

A question about data format

Open: seongminp opened this issue 3 years ago • 1 comment

Hello. First of all thank you for this wonderful library.

I have a question about the transcription format.

For each .txt file in data/ami-transcripts, does each line denote a single speaker?

For example, would each line of EN2001a belong to a separate speaker, of which there are 5?

Thank you!

seongminp, Sep 24 '21 05:09

Yes, you are correct: each line corresponds to a single speaker. However, this format doesn't preserve the speaker turns from the actual meetings. I also found some discrepancies between the transcript files and the speaker transcript files for ES2002c, but I need to check and verify that again.

saprativa, Sep 24 '21 14:09
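
For anyone reading along, here is a minimal sketch of how the one-line-per-speaker layout described above could be read. It is not part of the library: the helper names `split_by_speaker` and `load_speaker_lines` are hypothetical, and the code assumes only what the thread states, namely that each non-empty line of a transcript .txt file holds all the words of one speaker. As noted in the reply, speaker turn order is not recoverable from this layout.

```python
from pathlib import Path


def split_by_speaker(transcript_text):
    """Return one string per non-empty line.

    Assumption (from the thread): each line holds one speaker's words.
    """
    return [line.strip() for line in transcript_text.splitlines() if line.strip()]


def load_speaker_lines(path):
    """Hypothetical helper: read a transcript file and split it per speaker."""
    return split_by_speaker(Path(path).read_text(encoding="utf-8"))


# Synthetic stand-in for a transcript file (not real AMI data).
sample = "hello there okay\n\nyeah right\n"
speakers = split_by_speaker(sample)
for index, words in enumerate(speakers):
    print(f"speaker line {index}: {len(words.split())} words")
```

With a real file, something like `load_speaker_lines("data/ami-transcripts/EN2001a.txt")` should give one entry per speaker, so 5 entries for EN2001a if the layout is as described.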