ChatIE
NER Evaluation
This may relate more to the paper than to the tool, so apologies if this is the wrong channel.
How exactly is your NER evaluation performed? The relevant section of the paper reads:
We only consider the complete matching and use the micro F1 to evaluate NER task. Only when both the border and the type of the predicted entity and the true entity are the same will we regard it as a correct prediction.
Given that your example prompts in the appendix only list expected outputs of the form ["Japan", "LOC"], there is no direct access to an entity border/offset. Do you simply search for the answer text in the original sentence and use that index?
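To make the question concrete, here is a minimal sketch of the workaround I am guessing at: map each extracted (text, type) pair back to the offset of its first occurrence in the sentence, then score exact (border, type) matches with micro F1. All function names here are my own illustration, not from the ChatIE code:

```python
def entity_to_span(sentence, entity_text, entity_type):
    """Map an extracted entity string to character offsets via its first match.
    Returns (start, end, type), or None if the text is absent from the sentence."""
    start = sentence.find(entity_text)
    if start == -1:
        return None
    return (start, start + len(entity_text), entity_type)

def micro_f1(gold_sets, pred_sets):
    """Micro-averaged F1 over exact matches of (start, end, type) tuples."""
    tp = sum(len(g & p) for g, p in zip(gold_sets, pred_sets))
    n_pred = sum(len(p) for p in pred_sets)
    n_gold = sum(len(g) for g in gold_sets)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

sentence = "Japan began the defence of their title with a win."
gold = [{(0, 5, "LOC")}]                      # gold annotations with offsets
pred_raw = [[("Japan", "LOC")]]               # model output: text + type only
pred = [{s for s in (entity_to_span(sentence, t, ty) for t, ty in sent) if s}
        for sent in pred_raw]
print(micro_f1(gold, pred))  # 1.0
```

One obvious limitation of first-occurrence matching is that repeated mentions of the same string are ambiguous, which is why I am asking how the borders are actually recovered.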