label-studio-converter
Information lost at export for "visibleWhen" "toName" field
After I updated to v1.2.0, I noticed that the export format has changed. This [1] is my labeling interface config.
The entity_id field only appears when a text region is selected, so the labeler can choose whether or not to enter an entity id. The key point is that this field is optional. If it were required, this would not be an issue (see below for why).
Here is a sample of what the export format looked like previously:
{
...
"ner": "Dallas is 7-1-2 in its past 10, and is just two points out of a playoff spot heading into Tuesday night's clash with Carolina.",
"label": [
{ "start": 0, "end": 6, "text": "Dallas", "labels": ["Team"] },
{ "start": 117, "end": 125, "text": "Carolina", "labels": ["Team"] }
],
"entity_id": [
{ "start": 117, "end": 125, "text": ["282"] }
]
...
}
Here is a sample of what it looks like now:
{
...
"ner": "Dallas is 7-1-2 in its past 10, and is just two points out of a playoff spot heading into Tuesday night's clash with Carolina.",
"label": [
{ "start": 0, "end": 6, "text": "Dallas", "labels": ["Team"] },
{ "start": 117, "end": 125, "text": "Carolina", "labels": ["Team"] }
],
"entity_id": [ "282" ]
...
}
Problem
As you may notice, the only difference is in the entity_id field. At first glance there seems to be no problem: the format is simpler now.
However, the only way to link an entity_id back to its label was through the start and end fields that were previously available. Now that they are gone, there is no way to know whether "282" refers to the first or the second label.
This makes it impossible to make use of the additional entity_id labels.
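To make the dependency on start and end concrete, here is a minimal sketch of how the old export could be joined. The helper name link_entity_ids is hypothetical, and the data is a trimmed copy of the old sample above.

```python
def link_entity_ids(task):
    """Attach entity ids to labels by matching (start, end) offsets.

    Hypothetical helper: it assumes the old export shape, where every
    entity_id entry carries the offsets of the region it belongs to.
    """
    ids_by_span = {
        (entry["start"], entry["end"]): entry["text"][0]
        for entry in task.get("entity_id", [])
    }
    return [
        # Labels without a matching entity_id entry get None.
        {**label, "entity_id": ids_by_span.get((label["start"], label["end"]))}
        for label in task["label"]
    ]


old_export = {
    "label": [
        {"start": 0, "end": 6, "text": "Dallas", "labels": ["Team"]},
        {"start": 117, "end": 125, "text": "Carolina", "labels": ["Team"]},
    ],
    "entity_id": [{"start": 117, "end": 125, "text": ["282"]}],
}

linked = link_entity_ids(old_export)
```

With the new shape, "entity_id": ["282"], there is nothing to build the lookup key from, so the same join cannot be written.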
Potential solutions
- Revert to old export format
- Generate an id for each label and put the same id on the corresponding entity_id entry
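The second option could look roughly like this. It is only an illustration of the proposed shape, not the converter's actual code: the function name with_region_ids and the "r0", "r1" id scheme are made up for the example.

```python
def with_region_ids(task):
    """Return a copy of an old-style task where each label and its
    entity_id entry share a generated id (hypothetical output shape)."""
    labels = [{"id": f"r{i}", **label} for i, label in enumerate(task["label"])]
    id_by_span = {(label["start"], label["end"]): label["id"] for label in labels}
    entity_ids = [
        # Offsets are dropped here; the shared id is what links the entries.
        {"id": id_by_span.get((entry["start"], entry["end"])), "text": entry["text"]}
        for entry in task.get("entity_id", [])
    ]
    return {**task, "label": labels, "entity_id": entity_ids}


proposed = with_region_ids({
    "label": [
        {"start": 0, "end": 6, "text": "Dallas", "labels": ["Team"]},
        {"start": 117, "end": 125, "text": "Carolina", "labels": ["Team"]},
    ],
    "entity_id": [{"start": 117, "end": 125, "text": ["282"]}],
})
```

An export shaped like this stays compact while still letting each entity id be traced to exactly one label.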
Note: The problem persists for all export formats - the data for the entity_id field is insufficient, therefore unusable.
[1]
<View style="display: flex;">
  <View style="width: 240px; padding-left: 2em; margin-right: 2em; background: #f1f1f1; border-radius: 3px">
    <Labels name="label" toName="text" choice="multiple">
      <Label value="Team" background="red"/>
      <Label value="Player" background="darkorange"/>
    </Labels>
  </View>
  <View>
    <View style="overflow-y: auto">
      <Text name="text" value="$ner" saveTextResult="yes" granularity="symbol"/>
    </View>
    <View>
      <View visibleWhen="region-selected">
        <Header value="Entity id"/>
        <TextArea name="entity_id" toName="text" perRegion="true" maxSubmissions="1"/>
      </View>
    </View>
  </View>
</View>
@alexdevmotion Thank you for your bug report! Is it JSON_MIN format? Could you switch to full JSON?
@makseq Tried the JSON again and indeed it does not seem to have this issue. Seems to be happening with JSON_MIN and CSV.
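For anyone hitting this with JSON_MIN or CSV, a rough sketch of linking the two fields from the full JSON export instead. It assumes each item in an annotation's result list carries id, from_name and value, and that a per-region TextArea result shares its id with the label region it was entered for; the sample ids "a1" and "b2" and the helper merge_regions are invented for illustration, so check them against your own export.

```python
from collections import defaultdict


def merge_regions(results):
    """Group full-JSON result items by region id (assumed shared between
    the Labels result and its per-region TextArea result)."""
    merged = defaultdict(dict)
    for item in results:
        region = merged[item["id"]]
        value = item["value"]
        region.update({k: value[k] for k in ("start", "end") if k in value})
        if item["from_name"] == "label":
            region["labels"] = value["labels"]
            region["text"] = value.get("text")
        elif item["from_name"] == "entity_id":
            region["entity_id"] = value["text"][0]
    return dict(merged)


# Hand-written sample mimicking one annotation's "result" list.
sample_results = [
    {"id": "a1", "from_name": "label", "to_name": "text", "type": "labels",
     "value": {"start": 0, "end": 6, "text": "Dallas", "labels": ["Team"]}},
    {"id": "b2", "from_name": "label", "to_name": "text", "type": "labels",
     "value": {"start": 117, "end": 125, "text": "Carolina", "labels": ["Team"]}},
    {"id": "b2", "from_name": "entity_id", "to_name": "text", "type": "textarea",
     "value": {"start": 117, "end": 125, "text": ["282"]}},
]

merged = merge_regions(sample_results)
```

Regions without an entity id simply have no entity_id key, which keeps the optional field optional.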