How to use AMI dataset to evaluate the DER performance?

Open BLack-yzf opened this issue 4 years ago • 5 comments

Hi, I have seen some authors use the AMI corpus to evaluate diarization, but they give few details on how exactly the evaluation is done, such as how to choose the dev and test parts of AMI and how to prepare the data. Is there any guidance on using the AMI corpus to evaluate this task? Thanks. @wq2012

BLack-yzf avatar Jun 04 '20 03:06 BLack-yzf

AMI website provides an official train/dev/test split.

For data preparation, I did most of the work in a pyannote.audio tutorial already.

hbredin avatar Jun 04 '20 06:06 hbredin

Thanks a lot!

BLack-yzf avatar Jun 04 '20 07:06 BLack-yzf

Hi hbredin, I see your repo provides the files "MixHeadset.train.rttm" and "MixHeadset.train.uem" for Mix-Headset. Are there RTTM and UEM files for {headset-0,headset-1,headset-2,headset-3} as well? Thanks. @hbredin

BLack-yzf avatar Jun 04 '20 09:06 BLack-yzf

I don't know. I am not the creator of the AMI corpus. Check the official AMI website.

hbredin avatar Jun 08 '20 13:06 hbredin

Hi hbredin, your work on AMI data preparation has helped me a lot. Recently I have run into a problem and hope you can give me some advice. I have downloaded the Mix-Headset data and the corresponding RTTM files you provided, and I want to use the kaldi/ami recipe for data preparation. However, I can't get the "segments" files, so I tried to generate "segments" from the provided RTTM files. When running "md-eval.pl" in the step that produces the RTTMs, I get the error " Speaker IS1008d.Mix-Headset 1 108.664 -1.032 <NA> <NA>". I have found that some segments overlap, so I think my initial process for generating "segments" from the provided RTTM files is wrong. Could you provide the correct "segments" files for the Mix-Headset data (train, dev and test)? I look forward to your reply, thanks. @hbredin

BLack-yzf avatar Jun 11 '20 09:06 BLack-yzf
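
For reference, here is a minimal sketch of the RTTM-to-"segments" conversion discussed in the last comment. It is not taken from the kaldi/ami recipe or from pyannote.audio. The function name `rttm_to_segments` and the utterance-ID scheme are assumptions made for illustration, and the sample lines are made up. It relies only on the standard RTTM field order (type, file ID, channel, onset, duration, ..., speaker in field 8) and on the Kaldi "segments" layout (`<utterance-id> <recording-id> <start> <end>`). Each RTTM line is converted on its own, so overlapping speech from different speakers stays as separate, overlapping segments, and lines with a non-positive duration are skipped rather than written out.

```python
def rttm_to_segments(rttm_lines):
    """Convert RTTM SPEAKER lines to Kaldi-style 'segments' lines.

    Hypothetical helper for illustration; not part of Kaldi or pyannote.audio.
    """
    segments = []
    for line in rttm_lines:
        fields = line.split()
        # Only SPEAKER records carry speech turns; skip anything else.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        recording = fields[1]
        onset = float(fields[3])
        duration = float(fields[4])
        speaker = fields[7]
        # A non-positive duration is invalid in RTTM; drop it here.
        if duration <= 0:
            continue
        end = onset + duration
        # Assumed ID scheme: speaker first, then recording and times in
        # centiseconds, so sorting groups utterances by speaker.
        utt_id = "{}-{}-{:07d}-{:07d}".format(
            speaker, recording, round(onset * 100), round(end * 100)
        )
        segments.append(
            "{} {} {:.3f} {:.3f}".format(utt_id, recording, onset, end)
        )
    return sorted(segments)


if __name__ == "__main__":
    # Made-up sample lines: two overlapping turns by different speakers.
    sample = [
        "SPEAKER IS1008d.Mix-Headset 1 108.664 2.000 <NA> <NA> spkA <NA> <NA>",
        "SPEAKER IS1008d.Mix-Headset 1 109.500 1.500 <NA> <NA> spkB <NA> <NA>",
    ]
    for seg in rttm_to_segments(sample):
        print(seg)
```

Because the start and end times are derived per line from onset and duration, overlap between speakers cannot produce a negative duration in this sketch. If a negative duration shows up downstream, one possible cause (an assumption, not confirmed in this thread) is a step that computes a segment's end time from the next segment's start time.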