NeuroNER
Add handling of discontinuous annotations (brat >= 1.3).
This PR addresses the issue described in #36, which arises when brat_to_conll.py encounters discontinuous annotations created by brat >= 1.3. These can be created unintentionally by including a newline in the span of an annotation, or manually ("Add Frag").
I implemented two possible behaviors. A discontinuous annotation can either be split into multiple annotations (one for each fragment) or joined into an expanded annotation that starts with the first fragment and ends with the last. For examples, see test_brat_to_conll.py.
The choice is controlled by a new parameter, split_discontinuous. Its default is False, i.e., joining, which is the safer behavior when discontinuous annotations were created unintentionally.
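As a rough illustration of the two behaviors (not the code from this PR), here is a minimal sketch. The function names parse_spans and resolve_discontinuous are hypothetical; only the split_discontinuous parameter comes from the description above. The sketch assumes brat's standoff format, in which the fragments of a discontinuous annotation are separated by semicolons and listed in document order.

```python
def parse_spans(span_field):
    """Parse a brat span field such as '10 15;20 25' into (start, end) pairs."""
    return [tuple(int(x) for x in frag.split()) for frag in span_field.split(";")]


def resolve_discontinuous(ann_line, split_discontinuous=False):
    """Return (label, start, end) tuples for one text-bound annotation line.

    Hypothetical helper: a line is assumed to look like
    'T1<TAB>LABEL 10 15;20 25<TAB>covered text'.
    """
    _ann_id, type_and_spans, _text = ann_line.rstrip("\n").split("\t", 2)
    label, span_field = type_and_spans.split(" ", 1)
    fragments = parse_spans(span_field)
    if split_discontinuous:
        # One annotation per fragment.
        return [(label, start, end) for start, end in fragments]
    # Join: from the start of the first fragment to the end of the last
    # (assumes fragments are listed in document order).
    return [(label, fragments[0][0], fragments[-1][1])]


line = "T1\tDISABILITY 10 15;20 25\tfirst second"
joined = resolve_discontinuous(line)
split = resolve_discontinuous(line, split_discontinuous=True)
```

With the default, the example line yields a single annotation covering offsets 10 to 25; with split_discontinuous=True it yields one annotation per fragment. A continuous annotation has a single fragment, so both modes return it unchanged.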
Does this change also apply to converting from CoNLL to brat, or is it unnecessary to change the conll_to_brat file?
It isn't necessary. The issue is only with brat to conll.
OK, thank you for the help. Do you have an example of the output? Also, do you have the code with the changes? If so, I would really appreciate it if you could share it with me. Thank you.
Sure, the new tests demonstrate the changes. If you're running into problems with discontinuous annotations and need this fix now, you could clone my fork. It's up to date at the moment.
Hi James. I wonder if you can help with this problem. I have annotations nested inside other annotations, for example:
T2 SCOPE 53 69 with no dementia
T3 NEGATION 58 60 no
T4 DISABILITY 61 69 dementia
or
T3 SCOPE 1420 1455 not dependent on others for walking
T4 NEGATION 1420 1423 not
T5 DISABILITY 1424 1455 dependent on others for walking
I think I could handle these like discontinuous annotations, but I don't know if this is the best option. When I use the original brat_to_conll file, it always keeps only the first annotation, in this case the SCOPE annotation. Do you know how to handle this kind of nested annotation? I really appreciate your help. Thank you.
Sorry, I haven't looked into options for handling overlapping annotations.
Thanks for your help, James. By the way, is it possible to identify interactions between entities with NeuroNER, given a brat annotation like:
T39 disease 72 82 carcinomas
T56 body-part 61 71 colorectal
R1 relatedTo Arg1:T39 Arg2:T56
Thank you
Hi James.
Just a quick question: do you know if NeuroNER uses some kind of padding for character embeddings?
Hope you can help me with this.
Whatever happened to this PR? I'm trying to load a bunch of brat annotation files with discontinuous annotations. Is @jamesdunham's fork still the only option, and is the master branch ahead of it in other ways?