AMICorpusXML
A question about data format
Hello. First of all, thank you for this wonderful library.
I have a question about the transcription format.
For each `.txt` file in `data/ami-transcripts`, does each line denote a single speaker?
For example, would each line of EN2001a belong to a separate speaker, of which there are 5?
Thank you!
Yes, you are correct. However, this format does not preserve the speaker turns from the actual meetings. I also found some discrepancies between the transcript files and the speaker transcript files for ES2002c, but I need to check and verify that again.
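To sanity-check the one-line-per-speaker layout on your side, something like the following sketch may help. It is not part of the library: the helper names `load_speaker_lines` and `words_per_speaker` are made up for illustration, and it assumes the transcript is plain UTF-8 text where every non-empty line holds all of one speaker's words.

```python
from pathlib import Path


def load_speaker_lines(path):
    """Return one string per non-empty line of a transcript file.

    Assumption: each non-empty line holds the text of a single speaker.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def words_per_speaker(path):
    """Return (line_number, word_count) pairs, one per speaker line."""
    lines = load_speaker_lines(path)
    return [(i, len(line.split())) for i, line in enumerate(lines, start=1)]
```

For a meeting such as EN2001a you would pass the path of its `.txt` file under `data/ami-transcripts` (the exact file name is an assumption here) and check that the number of pairs returned matches the expected number of speakers. Because the speaker turns are not preserved, this only tells you how much each speaker said in total, not the order in which they spoke.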