bert_ner F1, recall and precision calculation

Open rahul-1996 opened this issue 6 years ago • 1 comments

Hi, I was wondering how you are actually calculating your scores.

y_true = np.array([hp.tag2idx[line.split()[1]] for line in open(f, 'r').read().splitlines() if len(line) > 0])
y_pred = np.array([hp.tag2idx[line.split()[2]] for line in open(f, 'r').read().splitlines() if len(line) > 0])

num_proposed = len(y_pred[y_pred>1])
num_correct = (np.logical_and(y_true==y_pred, y_true>1)).astype(np.int).sum()
num_gold = len(y_true[y_true>1])

precision = num_correct / num_proposed
recall = num_correct / num_gold

Can you explain what the above code means? How does this translate to, say, recall = TP / (TP + FN)? Don't you have to use some multi-class method?

Also, why are you only taking the index where y_true>1? Is it because you do not want the Other tag to skew your results? Thanks!

Apr 23 '19 18:04 rahul-1996

I think this is a fairly crude measure, since it counts every kind of tag (including I-XXXX) in the computation, so it scores individual tokens rather than whole entities.
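To make the token-level counting concrete, here is a minimal sketch of how the quantities in the question line up with TP, FP and FN. It assumes, as the `> 1` filter suggests but the thread does not confirm, that indices 0 and 1 are the padding and `O` tags; the `tag2idx` mapping and the function name `token_level_prf` are made up for illustration. It uses plain `int` because `np.int` was removed in NumPy 1.24.

```python
import numpy as np

# Hypothetical tag vocabulary: the thread only implies that indices 0 and 1
# are non-entity tags (assumed here to be padding and "O").
tag2idx = {"<PAD>": 0, "O": 1, "B-PER": 2, "I-PER": 3, "B-LOC": 4, "I-LOC": 5}


def token_level_prf(y_true, y_pred, first_entity_idx=2):
    """Micro-averaged, token-level precision/recall/F1 over entity tags."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    gold_is_entity = y_true >= first_entity_idx
    pred_is_entity = y_pred >= first_entity_idx

    # TP: the gold tag is an entity tag and the prediction matches it exactly.
    num_correct = int(np.sum((y_true == y_pred) & gold_is_entity))
    # TP + FP: every token the model labelled with some entity tag.
    num_proposed = int(pred_is_entity.sum())
    # TP + FN: every token whose gold label is some entity tag.
    num_gold = int(gold_is_entity.sum())

    precision = num_correct / num_proposed if num_proposed else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "O", "O", "B-LOC", "B-LOC"]
p, r, f1 = token_level_prf([tag2idx[t] for t in gold], [tag2idx[t] for t in pred])
print(p, r, f1)
```

Because the counts are pooled over all entity tags, this is a micro-average, which is why no separate per-class computation appears. Note that the half-recognised PER entity still earns credit for its `B-PER` token, which is the crudeness mentioned above.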

Using a standard evaluation tool, such as https://github.com/sighsmile/conlleval/blob/master/conlleval.py, is preferable.
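For comparison, entity-level scoring counts a prediction as correct only when the whole span and its type match the gold annotation. The sketch below is a simplified illustration of that idea, not the conlleval script itself: the helper names `extract_spans` and `entity_level_prf` are hypothetical, it assumes BIO tags, it treats a stray `I-X` as the start of a new span, and it scores a single sequence without handling sentence boundaries.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    spans = set()
    start, ent_type = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and ent_type == tag[2:]
        if continues_span:
            continue
        if ent_type is not None:
            spans.add((start, i, ent_type))  # end index is exclusive
        if tag.startswith("B-") or tag.startswith("I-"):
            start, ent_type = i, tag[2:]
        else:
            start, ent_type = None, None
    return spans


def entity_level_prf(gold_tags, pred_tags):
    """Precision/recall/F1 where only exact span-and-type matches count."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    num_correct = len(gold & pred)

    precision = num_correct / len(pred) if pred else 0.0
    recall = num_correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "O", "O", "B-LOC", "B-LOC"]
print(entity_level_prf(gold, pred))  # precision 1/3, recall 1/2
```

In this toy sequence the truncated PER span and the spurious second LOC span both count as errors, so only one of the three predicted entities is correct.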

Apr 24 '19 03:04 JianLiu91
