this-word-does-not-exist
Updated approx_pos method in dataset.py
Updated ParsedDictionaryDefinitionDataset approx_pos method in dataset.py.
Something appears to have changed in how a stanza.models.common.doc.Word is structured, causing the method approx_pos(cls, nlp, sentence, lookup_idx, lookup_len) to fail.
The Word object now looks something like:
{
  "id": 6,
  "text": "a",
  "upos": "DET",
  "xpos": "DT",
  "feats": "Definite=Ind|PronType=Art",
  "start_char": 23,
  "end_char": 24
}
The plus side of this is that start_char and end_char can now be read directly from the Word, without using a regex.
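As a rough illustration of reading the offsets directly, here is a minimal sketch. It is not the actual approx_pos implementation from dataset.py. WordStub is a hypothetical stand-in that carries only the fields shown in the dump above, find_upos is a made-up helper name, and the second sample word is invented for the example. It assumes real Word objects expose start_char and end_char as attributes in the same way.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class WordStub:
    """Hypothetical stand-in for stanza.models.common.doc.Word.

    Only carries the fields used here, taken from the dump above.
    """
    text: str
    upos: str
    start_char: int
    end_char: int


def find_upos(words: Iterable[WordStub], lookup_idx: int, lookup_len: int) -> Optional[str]:
    """Return the upos of the first word overlapping the looked-up span.

    The span is the half-open character range
    [lookup_idx, lookup_idx + lookup_len).
    """
    lookup_end = lookup_idx + lookup_len
    for word in words:
        # Two half-open ranges overlap when each starts before the other ends.
        if word.start_char < lookup_end and word.end_char > lookup_idx:
            return word.upos
    return None


# Sample data: the first entry mirrors the dump above, the second is invented.
words = [
    WordStub(text="a", upos="DET", start_char=23, end_char=24),
    WordStub(text="word", upos="NOUN", start_char=25, end_char=29),
]
result = find_upos(words, lookup_idx=25, lookup_len=4)
```

The overlap test uses the character offsets as they are, so no pattern matching against the string form of the Word is needed.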
I've tested the change in Google Colab.