label-studio-converter
Span shift in the exported data
The downloaded annotated files have a span shift: the start and end offsets of a label don't match the correct word in the document.
Could you provide more details please?
- What is your labeling config?
- What is your task data with this problem?
<View display="inline">
<View style="position:sticky; position: -webkit-sticky;
top: 0; background-color: #a6e0f4; z-index: 2;
border-radius: 5px;">
<Labels name="Attributes" toName="text" showInline="true">
<Label alias="negation" hotkey="N" value="negation" background="#d9534f"></Label>
<Label alias="frequency" hotkey="F" value="frequency" background="#2bbbad"></Label>
<Label alias="severity" hotkey="V" value="severity" background="#5cb0de"></Label>
<Label alias="change" hotkey="C" value="change" background="#aa66cc"></Label>
<Label alias="neuropathy" hotkey="I" value="neuropathy" background="#808080"></Label>
<Label alias="diarrhea" hotkey="D" value="diarrhea" background="#ffdf4f"></Label>
</Labels>
</View>
<View style="width:100%; z-index: 1;">
<Text name="text" value="$r_TEXT"></Text>
</View>
</View>
- NER task. In the exported file ([span start-end] label | text):
  [3322-3332] Neuropathy | neuropathy
  [8456-8464] frequency | r assess
  [8477-8485] Diarrhea | tinues t
Text: Undergone neuropathy with .....r assess of having a conditional frequency.... The spans are correct up to a point, then there is a sudden shift starting at offset 8456. This is observed in 10% of the exported files.
Do you have special characters in your texts (smileys or other extra symbols)? Are you on Windows? It uses \r\n for new lines, which could affect the positions of the exported words.
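To illustrate why such characters matter, here is a minimal sketch of how the same position can get a different number depending on how offsets are counted. The helper names and the sample text are made up for this example, and whether any of these counting schemes is what actually causes the shift in the export is an assumption, not something confirmed in this thread.

```python
def utf8_offset(text: str, index: int) -> int:
    """Offset of character `index` when counted in UTF-8 bytes."""
    return len(text[:index].encode("utf-8"))


def utf16_offset(text: str, index: int) -> int:
    """Offset of character `index` when counted in UTF-16 code units."""
    return len(text[:index].encode("utf-16-le")) // 2


def crlf_offset(text: str, index: int) -> int:
    """Offset of character `index` if every "\n" before it were stored as "\r\n"."""
    return index + text[:index].count("\n")


# Hypothetical sample: two bullet points (U+2022) and one line break.
sample = "\u2022 pain\n\u2022 no diarrhea"
start = sample.index("diarrhea")  # offset counted in Python characters

offsets = {
    "characters": start,
    "utf8_bytes": utf8_offset(sample, start),
    "utf16_units": utf16_offset(sample, start),
    "crlf_newlines": crlf_offset(sample, start),
}
print(offsets)
```

If two of these numbers differ for your text, any tool that writes offsets with one scheme and reads them with another will show a shift that grows with each special character or line break before the span.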
No smileys, but I do have bullet points. I am on a Mac.
Try reducing the text to a single bullet and 2-3 words, then check the offsets in the exported file.
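A small check like the following could help with that test. It is only a sketch: `check_span` is a hypothetical helper, the sample text is invented, and you would paste in the start and end values from your own export by hand.

```python
def check_span(text: str, start: int, end: int, expected: str):
    """Compare text[start:end] with the word that was actually highlighted.

    Returns (matches, actual_slice, shift), where shift is the exported start
    minus the position of the first occurrence of `expected` in the text
    (None if the word does not occur at all).
    """
    actual = text[start:end]
    true_start = text.find(expected)  # first occurrence is enough for a tiny text
    shift = None if true_start == -1 else start - true_start
    return actual == expected, actual, shift


# Hypothetical minimal task: one bullet and two words.
text = "\u2022 mild neuropathy"

# Offsets that line up with the highlighted word.
print(check_span(text, 7, 17, "neuropathy"))

# Offsets that are two characters too far to the right.
print(check_span(text, 9, 19, "neuropathy"))
```

If the slice matches for the minimal text but not for the full document, adding the bullets and line breaks back one at a time should show which character introduces the shift.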