Bio-Epidemiology-NER
Breaking word while labelling
Hey, kudos for the amazing work on biomedical NER. It's really awesome how good it is. But sometimes it breaks a word into multiple subword tokens and labels each piece separately, which is kind of weird. Can we stop the model from doing that?
e.g.:
{
  "entity_group": "Administration",
  "score": 0.46949705481529236,
  "word": "thor",
  "start": 424,
  "end": 428
},
{
  "entity_group": "Medication",
  "score": 0.7422544360160828,
  "word": "##ugh",
  "start": 428,
  "end": 431
}
Hi Parth, thanks for your review. I am currently working on it and will update once this is done.
Hi Deepak, this still seems to be an issue, at least on the Hugging Face version of the model. Is there an update on it?
Hi. Thanks for the model. Great work! I'm seeing the same here (^ = break): An^esthesia, arthros^copic. I'm looking at the training files now and will let you know if I find the reason.
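Until the root cause is found, one possible workaround is to merge the pieces after prediction. The sketch below is only an illustration, not the project's own fix: `merge_wordpieces` is a hypothetical helper, and it assumes the output is a list of dicts shaped like the example in the first post, where a continuation piece starts with "##" and begins exactly where the previous piece ends. It keeps the label and score of the first piece, which is a design choice and may not be the right one for every case.

```python
def merge_wordpieces(entities):
    """Merge WordPiece continuation tokens ("##...") into the preceding entity.

    Hypothetical helper: assumes each item has "entity_group", "score",
    "word", "start" and "end" keys, as in the output posted above.
    """
    merged = []
    for ent in entities:
        is_continuation = ent["word"].startswith("##")
        # Only merge when the piece directly follows the previous one in the text.
        if is_continuation and merged and merged[-1]["end"] == ent["start"]:
            prev = merged[-1]
            prev["word"] += ent["word"][2:]  # drop the "##" prefix
            prev["end"] = ent["end"]
            # Label and score of the first piece are kept unchanged.
        else:
            merged.append(dict(ent))  # copy so the input list is not modified
    return merged


# The two pieces from the first post.
entities = [
    {"entity_group": "Administration", "score": 0.46949705481529236,
     "word": "thor", "start": 424, "end": 428},
    {"entity_group": "Medication", "score": 0.7422544360160828,
     "word": "##ugh", "start": 428, "end": 431},
]

result = merge_wordpieces(entities)
print(result)
```

With the two pieces above, this yields a single "Administration" entity spanning 424 to 431. If you are calling the model through the Hugging Face `pipeline`, it may also be worth trying the `aggregation_strategy` argument with a word-level value ("first", "average" or "max") instead of "simple"; I have not verified whether that resolves the split for this model.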