nltk
Incorrect Krippendorff's alpha result with missing values or missing data
from nltk import agreement
rater1 = [1,1,2]
rater2 = [1,1,None]
rater3 = [None,1,2]
taskdata = (
    [[0, str(i), str(rater1[i])] for i in range(len(rater1))]
    + [[1, str(i), str(rater2[i])] for i in range(len(rater2))]
    + [[2, str(i), str(rater3[i])] for i in range(len(rater3))]
)
print(taskdata) # (annotator_id, sample_id, label_id)
ratingtask = agreement.AnnotationTask(data=taskdata)
print("alpha " +str(ratingtask.alpha())) # krippendorffs alpha
alpha 0.33333333333333337
I am not sure whether this result is correct. In my example, there are 3 examples to annotate. Apart from the missing values (not every annotator annotated every example), all annotators provided the same labels, so I expected alpha to be 1. However, I got 0.33.
Does anyone know whether 0.33 or 1.0 is the correct value of Krippendorff's alpha here?
Thank you!
Hello!
Depending on your interpretation, both could be correct. Upon inspection, it seems that None is taken as an observation. You can tell this by replacing None with e.g. "3":
from nltk.metrics.agreement import AnnotationTask
rater1 = [1, 1, 2]
rater2 = [1, 1, 3]
rater3 = [3, 1, 2]
taskdata = (
[[0, str(i), str(rater1[i]) if rater1[i] else rater1[i]] for i in range(0, len(rater1))]
+ [[1, str(i), str(rater2[i]) if rater2[i] else rater2[i]] for i in range(0, len(rater2))]
+ [[2, str(i), str(rater3[i]) if rater3[i] else rater3[i]] for i in range(0, len(rater3))]
)
print(taskdata) # (annotator_id, sample_id, label_id)
ratingtask = AnnotationTask(data=taskdata)
print("alpha " + str(ratingtask.alpha())) # krippendorffs alpha
This also outputs:
alpha 0.33333333333333337
There is indeed an argument that None values should be ignored. This allows for partial annotation, as you mentioned. This would be as simple as modifying this line:
https://github.com/nltk/nltk/blob/ad3c84c792453a44f2d195e5263e72b315774478/nltk/metrics/agreement.py#L310
to be
label_freqs = FreqDist(x["labels"] for x in itemdata if x["labels"] is not None)
Running the aforementioned program with this line change results in:
alpha 1.0
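To see where the two numbers come from, here is a minimal, self-contained sketch of Krippendorff's alpha for nominal labels. It is not NLTK's code: the function name `nominal_alpha` and the `ignore_missing` flag are made up for illustration. It assumes a binary (0/1) distance between labels and drops items that end up with fewer than two ratings.

```python
from collections import Counter, defaultdict


def nominal_alpha(triples, ignore_missing=True):
    """Krippendorff's alpha for nominal labels.

    `triples` is an iterable of (coder, item, label). Hypothetical helper,
    written for illustration only; not part of NLTK.
    """
    # Group labels by item, optionally dropping missing (None) labels.
    labels_by_item = defaultdict(list)
    for _coder, item, label in triples:
        if ignore_missing and label is None:
            continue
        labels_by_item[item].append(label)

    overall = Counter()  # label counts over all items with >= 2 ratings
    observed = 0.0       # summed within-item disagreement
    for labels in labels_by_item.values():
        m = len(labels)
        if m < 2:
            continue  # a single rating cannot be paired with anything
        counts = Counter(labels)
        overall.update(counts)
        # Ordered pairs of differing labels within this item.
        mismatched = m * m - sum(c * c for c in counts.values())
        observed += mismatched / (m - 1)

    n = sum(overall.values())
    if n < 2:
        raise ValueError("not enough pairable ratings")

    d_o = observed / n  # observed disagreement
    # Expected disagreement: differing ordered pairs over all ratings.
    d_e = (n * n - sum(c * c for c in overall.values())) / (n * (n - 1))
    if d_e == 0:
        return 1.0  # only one label in use, nothing to disagree on
    return 1.0 - d_o / d_e


rater1 = [1, 1, 2]
rater2 = [1, 1, None]
rater3 = [None, 1, 2]
triples = [
    (coder, item, label)
    for coder, ratings in enumerate([rater1, rater2, rater3])
    for item, label in enumerate(ratings)
]

print("None counted as a label:", nominal_alpha(triples, ignore_missing=False))
print("None ignored:           ", nominal_alpha(triples, ignore_missing=True))
```

With `ignore_missing=False` the None entries act as a third label, so items 0 and 2 each contain a disagreement and the result is about 0.333. With `ignore_missing=True` every remaining item is unanimous and the result is 1.0.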
That said, this would only allow missing annotations for the alpha method, and (perhaps) not for other metrics (e.g. pi). That's why I'm hesitant to make a pull request for this yet. I'm not very familiar with the agreement module, nor with the research on the topic.
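Without patching NLTK, a possible workaround is to drop missing ratings before building the task data. Note that the code in the question wraps every label in `str(...)`, so a missing rating becomes the string "None" and would not be caught by an `is not None` check; the filter has to run before that conversion. A minimal sketch, where `build_triples` is a hypothetical helper and not part of NLTK:

```python
def build_triples(ratings_by_coder):
    """Return (coder, item, label) triples, skipping missing (None) ratings.

    Hypothetical helper for illustration; not part of NLTK.
    """
    triples = []
    for coder, ratings in enumerate(ratings_by_coder):
        for item, label in enumerate(ratings):
            if label is None:
                continue  # this coder did not annotate this item
            # Convert to str only after the None check.
            triples.append((coder, str(item), str(label)))
    return triples


taskdata = build_triples([[1, 1, 2], [1, 1, None], [None, 1, 2]])
print(taskdata)  # (annotator_id, sample_id, label_id), missing ratings omitted
```

The resulting list can be passed to `AnnotationTask(data=taskdata)` as before. Whether metrics other than alpha behave sensibly on such incomplete data is not verified here.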
Perhaps someone has ideas on whether we should make changes here or not?
- Tom Aarsen
This issue is very much related to #2732, which asks how Agreement Krippendorff's alpha handles missing values.
Hello, thank you so much for your kind response, and also the nice code demonstration! This is very helpful to me. Allowing missing data for alpha is exactly what I need.
As you mentioned, I also hope someone with more experience in this topic can share more ideas. :)