SkillNER
Make text cleaning optional.
Is your feature request related to a problem? Please describe.
The text cleaning step makes it impossible to link annotated spans to character indices in the original text. This, in turn, makes it impossible to compare the performance of this model with that of other NER models.
Describe the solution you'd like
Make the text cleaning step optional. When the cleaning step is omitted, abv_text == immutable_text.
Describe alternatives you've considered
Provide additional metadata containing the start and end character indices of each annotated span in the original text, in addition to the boundaries in the cleaned text.
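To illustrate the alternative, here is a minimal sketch of how a cleaner could record, for every character it keeps, the index of that character in the original text. It does not use SkillNER's own code: the functions clean_with_offsets and to_original_span are hypothetical, and the cleaning rules (lowercasing, dropping punctuation, collapsing whitespace) are assumptions chosen only for the example.

```python
import string


def clean_with_offsets(text):
    """Clean `text` and return (cleaned, offsets).

    offsets[i] is the index in `text` of the character that produced
    cleaned[i]. The cleaning rules below are assumptions for this sketch,
    not SkillNER's actual rules.
    """
    cleaned_chars = []
    offsets = []
    for i, ch in enumerate(text):
        if ch in string.punctuation:
            continue  # assumed rule: drop ASCII punctuation
        if ch.isspace():
            # assumed rule: collapse whitespace runs, skip leading whitespace
            if not cleaned_chars or cleaned_chars[-1] == " ":
                continue
            ch = " "
        # lower() can yield more than one character, so record each one
        for lowered in ch.lower():
            cleaned_chars.append(lowered)
            offsets.append(i)
    return "".join(cleaned_chars), offsets


def to_original_span(offsets, start, end):
    """Map a half-open span [start, end) on the cleaned text to the original."""
    return offsets[start], offsets[end - 1] + 1


# Usage: a span found on the cleaned text is mapped back to the original.
text = "Skilled in   Python, and SQL!"
cleaned, offsets = clean_with_offsets(text)
start = cleaned.index("python")
orig_start, orig_end = to_original_span(offsets, start, start + len("python"))
print(text[orig_start:orig_end])  # prints "Python"
```

With such a mapping, each annotation could carry both pairs of boundaries, so results can be scored against datasets annotated on the raw text.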
You could instantiate your own empty skillNer.cleaner.Cleaner to bypass text cleaning. However, you would also need to protect abv_text from later processing, which would require some changes to the code.