entity-fishing
Effect of the different possible parameter combinations for ‘mentions’ in the REST API
The following observations come from the online API.
1/ When the order of ‘wikipedia’ and ‘ner’ in the mentions parameter is reversed, the result is different. Specifically, when ‘ner’ comes second, NER isn’t performed at all. The documentation doesn’t cover this constraint.
Result with wikipedia first and ner second:
Result with ner first and wikipedia second:
Thanks for reporting this, @aa303554. This does indeed look like a bug, as the order of the processes that extract the mentions should not change the results.
I need to look into it. For the time being, keep them in the order ["ner", "wikipedia"].
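For reference, a minimal sketch of a query body with the recommended order. Only the `mentions` value comes from this thread; the `text` and `language` fields and the sample sentence are assumptions about the query format and should be checked against the documentation.

```python
import json

# Hypothetical query body: only "mentions" is taken from this thread.
# The "text" and "language" fields are assumed, not confirmed here.
query = {
    "text": "Austria invaded and fought the Serbian army at the Battle of Cer.",
    "language": {"lang": "en"},
    # Recognizers are applied in list order: "ner" first, "wikipedia" second.
    "mentions": ["ner", "wikipedia"],
}

# JSON arrays preserve order, so the recognizer order survives serialization.
payload = json.dumps(query)
print(payload)
```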
Hmm, it's not a bug: the result depends on the order, and this is the expected behavior. The order really does have to be taken into account.
The mentions field gives the list of "mention recognizers" to be applied successively. If a mention has already been recognized by wikipedia, it is not "overwritten" by the NER mention. Similarly, if the NER mention is found first, the wikipedia one does not apply. In general, we must start with the most specific mention recognizer and finish with the most generic one, Wikipedia.
This is probably easier to understand with a specialized mention recognizer, such as a module that recognizes species names. It has to be applied first because it is the most specific (it already disambiguates the species name, whereas wikipedia is not as precise). However, the tool has no way of knowing in advance which recognizer is the most specific, so the order is used. Does that make sense to you?
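The "first recognizer wins" rule described above can be sketched as follows. This is a simplified illustration, not the actual entity-fishing implementation: the two recognizers, the overlap test, and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Mention = Tuple[int, int, str]  # (start offset, end offset, source recognizer)
Recognizer = Callable[[str], List[Mention]]


def find_all(text: str, phrase: str, source: str) -> List[Mention]:
    """Return every occurrence of phrase in text as a mention."""
    mentions = []
    start = text.find(phrase)
    while start != -1:
        mentions.append((start, start + len(phrase), source))
        start = text.find(phrase, start + 1)
    return mentions


# Hypothetical recognizers: a specific one (species) and a generic one.
def species_recognizer(text: str) -> List[Mention]:
    return find_all(text, "Panthera leo", "species")


def wikipedia_recognizer(text: str) -> List[Mention]:
    return find_all(text, "Panthera", "wikipedia") + find_all(text, "Africa", "wikipedia")


def overlaps(a: Mention, b: Mention) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def extract_mentions(text: str, recognizers: List[Recognizer]) -> List[Mention]:
    """Apply recognizers in order; a later mention never overwrites an earlier one."""
    accepted: List[Mention] = []
    for recognize in recognizers:
        for candidate in recognize(text):
            if not any(overlaps(candidate, kept) for kept in accepted):
                accepted.append(candidate)
    return sorted(accepted)


text = "Panthera leo lives in Africa."
specific_first = extract_mentions(text, [species_recognizer, wikipedia_recognizer])
generic_first = extract_mentions(text, [wikipedia_recognizer, species_recognizer])
print(specific_first)  # the species mention covers "Panthera leo"
print(generic_first)   # the generic "Panthera" mention blocks the species one
```

With the specific recognizer first, the full species name is kept; with the generic one first, the shorter generic mention takes the span and the species mention is dropped.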
OK, we need to update the documentation to clarify that.