spacy-stanza

Unknown morphological feature: 'ConjType'

Open TahaMunir1 opened this issue 4 years ago • 1 comments

When I run nlp(comment) for the Urdu language, I get the following error:

[E167] Unknown morphological feature: 'ConjType' (9141427322507498425). This can happen if the tagger was trained with a different set of morphological features. If you're using a pretrained model, make sure that your models are up to date: python -m spacy validate

Some of the docs work while others don't.

To Reproduce

I use the following code to get tokens and POS tags:

import stanza
from spacy_stanza import StanzaLanguage

snlp = stanza.Pipeline(lang='ur')
nlp = StanzaLanguage(snlp)
doc = nlp('یہ سرد اور تلخ تھا')

Environment: Windows and CentOS, Python 3.8, Stanza version 1.0.0

TahaMunir1, May 11 '20 12:05

Sorry, some of the tag maps haven't been tested well for unsupported morphological features, in particular for languages where spaCy doesn't provide models: we don't train a tagger internally for those languages, so we never catch this kind of error in the tag map.

Try using spaCy v2.3.0, which has an updated tag map for Urdu. If you want to use v2.2 or an older version, you can also just edit the tag map in your installation (under spacy/lang/ur/tag_map.py) to remove any unsupported morphological features, such as "ConjType": "coor", which I think is the unsupported feature here.
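For anyone who would rather not hand-edit the file, the same cleanup could be sketched roughly like this. It's only an illustration, not spaCy's own code: strip_unsupported_features is a hypothetical helper, and the sample entries are made up, not copied from the actual Urdu tag map. It assumes the tag map is a plain dict that maps each tag to a dict of features.

```python
def strip_unsupported_features(tag_map, unsupported):
    """Return a copy of tag_map without the given morphological feature keys.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    cleaned = {}
    for tag, features in tag_map.items():
        # Keep every feature except the ones listed as unsupported.
        cleaned[tag] = {
            key: value
            for key, value in features.items()
            if key not in unsupported
        }
    return cleaned


# Made-up entries for illustration; NOT the real contents of
# spacy/lang/ur/tag_map.py.
sample_tag_map = {
    "CC": {"pos": "CCONJ", "ConjType": "coor"},
    "NN": {"pos": "NOUN", "Number": "sing"},
}

cleaned_map = strip_unsupported_features(sample_tag_map, {"ConjType"})
print(cleaned_map["CC"])
```

The helper builds a new dict instead of modifying the original, so you can compare the two before changing anything in your installation.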

adrianeboyd, Jun 25 '20 12:06

Just going through some older issues...

I think this was resolved in spacy v2.3, or at the very latest in spacy v3.

But please feel free to reopen if you're still running into issues!

adrianeboyd, Oct 09 '23 14:10