ontobio
"category" field in Token should be "categories"
In ontobio.model.nlp (https://github.com/biolink/ontobio/blob/master/ontobio/model/nlp.py#L19) the field category is always empty, since SciGraph appears to return the field as categories. I can't find precisely where or when in the SciGraph commit history the field was named categories, but you can see from this query that the returned field name is currently categories:
curl -X POST "https://scigraph-ontology.monarchinitiative.org/scigraph/annotations/entities" -H "accept: application/json" -H "content-type: application/x-www-form-urlencoded" -d "content=male&minLength=4&longestOnly=false&includeAbbrev=false&includeAcronym=false&includeNumbers=false"
The result being:
[
  {
    "token": {
      "id": "UBERON:0003101",
      "categories": [
        "anatomical entity"
      ],
      "terms": [
        "male organism"
      ]
    },
    "start": 0,
    "end": 4
  },
  {
    "token": {
      "id": "PATO:0000384",
      "categories": [
        "quality"
      ],
      "terms": [
        "male"
      ]
    },
    "start": 0,
    "end": 4
  },
  {
    "token": {
      "id": "WBbt:0007850",
      "categories": [
        "anatomical entity"
      ],
      "terms": [
        "male"
      ]
    },
    "start": 0,
    "end": 4
  }
]
If this is in fact a mislabeled field, both the field in ontobio.model.nlp.Token and the field in biolink.datamodel.serializers (https://github.com/biolink/biolink-api/blob/master/biolink/datamodel/serializers.py#L336) will need to be corrected.
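As a rough illustration of the kind of fix this might involve, here is a minimal sketch of a token model that reads the plural key. TokenSketch and token_from_scigraph are hypothetical names made up for this example; they are not the actual ontobio or biolink-api code. The fallback to the singular category key is an assumption, included only in case some SciGraph deployment still returns it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TokenSketch:
    """Hypothetical stand-in for a token model; not ontobio's actual class."""
    id: Optional[str] = None
    categories: List[str] = field(default_factory=list)
    terms: List[str] = field(default_factory=list)


def token_from_scigraph(raw: Dict[str, Any]) -> TokenSketch:
    """Build a TokenSketch from one "token" object of a SciGraph annotation."""
    # Prefer the plural key seen in the response above; fall back to the
    # singular key (assumption: an older or different deployment may use it).
    categories = raw.get("categories") or raw.get("category") or []
    if isinstance(categories, str):
        # Normalize a lone string into a one-element list.
        categories = [categories]
    return TokenSketch(
        id=raw.get("id"),
        categories=list(categories),
        terms=list(raw.get("terms") or []),
    )


# One "token" object copied from the response shown in this issue.
sample = {"id": "PATO:0000384", "categories": ["quality"], "terms": ["male"]}
token = token_from_scigraph(sample)
print(token.categories)  # ['quality']
```

With a model along these lines, the category information from the query above would no longer come back empty, and the serializer field could be renamed to match.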
(This issue comes from investigating https://github.com/biolink/biolink-api/issues/387.)