
different number of annotators

Open SarikGhazarian opened this issue 2 years ago • 3 comments

Hello, I was looking into the human annotations for the USS dataset and noticed that different conversations are annotated by different numbers of annotators. May I know the reason for this, and how the number of utterances with each annotation rating in the table was calculated?

SarikGhazarian, Oct 04 '23 18:10

Hello,

The final score of an utterance is determined by the majority annotation. For example, a data sample with annotations (3, 3, 4) translates to a score of 3, as 3 is the majority annotation.

We conduct additional labeling on entire conversations when the initial annotators are inconsistent (i.e., no majority can be determined). Thus some data may receive more than 3 annotations.
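For illustration, the majority rule described above could be sketched in Python roughly as follows. This is only a sketch, not the dataset's actual aggregation script: the function name `majority_label` is hypothetical, and it assumes that "majority" means the single most frequent rating, with a tie counted as "unable to determine majority".

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(ratings: Sequence[int]) -> Optional[int]:
    """Return the single most frequent rating, or None if there is a tie.

    Assumption: "majority" is read as the unique most common rating.
    A None result marks a sample that would need additional labeling.
    """
    counts = Counter(ratings).most_common()
    if not counts:
        return None  # no annotations at all
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # top two ratings are tied, so no majority
    return counts[0][0]


print(majority_label((3, 3, 4)))  # the example from this comment
print(majority_label((2, 3, 4)))  # all annotators disagree
```

Under this reading, (3, 3, 4) gives 3, while (2, 3, 4) gives `None` and would be sent for another round of labeling.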

sunnweiwei, Oct 04 '23 18:10

Thanks for the clarification! In that case, take the first conversation from MWOZ, which has labels 3, 3, 2: why did you add another annotator even though the majority vote was already clear?

SarikGhazarian, Oct 04 '23 20:10

Thanks for your question! We added some additional annotations to data that had been labeled by a select group of outlier annotators, i.e., those whose final annotation score distribution seems inconsistent with that of most annotators, for example by being more likely to give high scores. We did not remove these outlier annotators, because dialogue evaluation involves some degree of subjectivity.
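As a rough illustration of what "more likely to give high scores" could look like in practice, here is a minimal Python sketch. It is not the procedure we used: the function name `flag_high_scorers`, the per-annotator dictionary format, and the `margin` threshold are all assumptions made up for this example.

```python
from statistics import mean
from typing import Dict, List


def flag_high_scorers(
    ratings_by_annotator: Dict[str, List[int]], margin: float = 0.5
) -> List[str]:
    """Flag annotators whose mean rating exceeds the pooled mean by `margin`.

    Hypothetical heuristic: both the input format and the threshold are
    illustrative assumptions, not part of the USS dataset release.
    """
    pooled = [r for rs in ratings_by_annotator.values() for r in rs]
    if not pooled:
        return []  # nothing to compare against
    overall = mean(pooled)
    return sorted(
        annotator
        for annotator, rs in ratings_by_annotator.items()
        if rs and mean(rs) - overall > margin
    )


example = {
    "a1": [3, 3, 3, 2],
    "a2": [3, 2, 3, 3],
    "a3": [5, 5, 4, 5],  # consistently higher than the others
}
print(flag_high_scorers(example))
```

Data from annotators flagged this way would receive an extra annotation rather than being discarded, which matches the choice described above to keep outlier annotators.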

sunnweiwei, Oct 17 '23 16:10