QaNER

F1 score should probably only consider the entities

Open Godxia opened this issue 1 year ago • 0 comments

Thank you for your project. I noticed that the metrics are computed from a confusion matrix, and that the "O" type is not excluded when calculating the final macro F1 score. I think this is incorrect: "O" spans far outnumber entity spans, and the goal is mainly to extract entities, so "O" predictions should generally be excluded from the score. I made the following changes in metric.py:

```python
# Counters for the entity-only (non-"O") micro metrics; initialize before the loop.
all_true = all_pred = all_pred_correct = 0

for span_true, span_pred in zip(spans_true_batch, spans_pred_batch_top_1):
    span_pred = span_pred[0]  # type: ignore

    i = entity_mapper[span_true.label]
    j = entity_mapper[span_pred.label]  # type: ignore

    confusion_matrix_true_denominator[i] += 1
    confusion_matrix_pred_denominator[j] += 1
    if span_true == span_pred:
        ner_confusion_matrix[i, j] += 1

    # Assumes index 0 in entity_mapper is the "O" label.
    if i != 0:
        all_true += 1
        if span_true == span_pred:
            all_pred_correct += 1
    # Count every predicted entity, including false positives on true "O" spans.
    if j != 0:
        all_pred += 1

# Guard on the denominators to avoid division by zero.
p1 = all_pred_correct / all_pred if all_pred != 0 else 0
r1 = all_pred_correct / all_true if all_true != 0 else 0
f1 = 2 * p1 * r1 / (p1 + r1) if p1 + r1 != 0 else 0
metrics["ner_p"] = p1
metrics["ner_r"] = r1
metrics["ner_f1"] = f1
```

Here `ner_f1` is the micro F1 over entity types only, with "O" excluded.
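For illustration, here is a minimal standalone sketch of the same entity-only micro precision, recall, and F1. The function name `entity_micro_prf`, the `outside_label` parameter, and the toy labels are hypothetical and not part of QaNER. It also simplifies by comparing labels only, whereas the snippet above compares full spans.

```python
from typing import Dict, Sequence, Tuple


def entity_micro_prf(
    pairs: Sequence[Tuple[str, str]],
    outside_label: str = "O",
) -> Dict[str, float]:
    """Micro P/R/F1 over (true_label, pred_label) pairs, ignoring the outside label."""
    # Gold entities: every pair whose true label is not "O".
    n_true = sum(1 for t, _ in pairs if t != outside_label)
    # Predicted entities: every pair whose predicted label is not "O",
    # including false positives where the true label is "O".
    n_pred = sum(1 for _, p in pairs if p != outside_label)
    # Correct entities: labels match and the label is not "O".
    n_correct = sum(1 for t, p in pairs if t == p and t != outside_label)

    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_true if n_true else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"ner_p": precision, "ner_r": recall, "ner_f1": f1}


# Toy data (made up): eight correct "O" pairs, one correct entity,
# one missed entity, and one false-positive entity.
pairs = [("O", "O")] * 8 + [("PER", "PER"), ("LOC", "O"), ("O", "ORG")]
scores = entity_micro_prf(pairs)
accuracy_with_o = sum(t == p for t, p in pairs) / len(pairs)
print(scores, accuracy_with_o)
```

On this toy data the many correct "O" pairs push plain accuracy well above the entity-only scores, which is the imbalance described in this issue.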

Godxia • Mar 23 '23 05:03