doccano-transformer
to_conll2003 puts the "B-" tag on the wrong token
How to reproduce the behaviour
{"id": 19523, "text": "\"颜姑娘。\"易左古不懂颜幼韶所道万福是什么意思。", "meta": {}, "annotation_approver": null, "labels": [[6, 9, "SPEAKER"]]}
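When this record is converted with to_conll2003, the B-SPEAKER tag is not assigned to 易 (offset 6, the first character of the labeled span). Instead it is assigned to the token just before 易, which is the closing quotation mark ".
Solution:
I changed the condition in the function "create_bio_tags" in "utils.py" from "<" to "<=", as shown below (the original line is commented out):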
#if i >= n or token_end < labels[i][0]:
if i >= n or token_end <= labels[i][0]:
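To illustrate why the comparison matters, here is a minimal, simplified sketch of a BIO tagger. It is not the library's actual code: the function `bio_tags`, its `ends_before` parameter, and the character-level tokenization are assumptions made for this example. It assumes `token_end` is an exclusive offset (`token_start + len(token)`), so a token whose end equals the label's start does not overlap the label.

```python
import operator

def bio_tags(tokens, offsets, labels, ends_before=operator.le):
    """Hypothetical, simplified BIO tagger (not doccano-transformer's code).

    labels are (start, end, name) with an exclusive end offset.
    ends_before decides whether a token lies entirely before the next label:
    operator.lt mimics the reported behaviour, operator.le the proposed fix.
    """
    labels = sorted(labels)
    i, prefix, tags = 0, "B-", []
    for token, token_start in zip(tokens, offsets):
        token_end = token_start + len(token)  # exclusive end offset
        if i >= len(labels) or ends_before(token_end, labels[i][0]):
            tags.append("O")
            continue
        tags.append(prefix + labels[i][2])
        if token_end < labels[i][1]:
            prefix = "I-"   # the label continues on the next token
        else:
            i += 1          # label finished; move on to the next one
            prefix = "B-"
    return tags

# First 11 characters of the text from the report, one token per character
# (an assumption; the real tokenizer may differ).
text = '"颜姑娘。"易左古不懂'
tokens = list(text)
offsets = list(range(len(text)))
labels = [(6, 9, "SPEAKER")]

buggy = bio_tags(tokens, offsets, labels, ends_before=operator.lt)
fixed = bio_tags(tokens, offsets, labels, ends_before=operator.le)

for token, b, f in zip(tokens, buggy, fixed):
    print(token, b, f)
```

With `<`, the closing quotation mark at offset 5 ends exactly at 6, is not treated as lying before the label, and receives B-SPEAKER. With `<=`, it is tagged O and B-SPEAKER lands on 易.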
Your Environment
- Operating System: macOS 11.0.1
- Python Version Used: 3.7.3
- doccano-transformer Version: 1.0.2