wink-nlp

The word AI is classified as the word be during POS tagging.

Open moskaliukua opened this issue 1 year ago • 1 comment

Hi, I have run into a problem with POS tagging in sentences like "It is an AI." The behavior seems to be consistent in other sentences as well, for example:

"It made a lot of waves in the AI field." I would expect the word "AI" to be tagged as PROPN, but instead it is tagged as AUX and its lemma is "be".

import winkNLP from 'wink-nlp';
import model from 'wink-eng-lite-web-model';
const nlp = winkNLP(model);
const its = nlp.its; // helpers such as its.lemma are exposed on the nlp instance
const doc = nlp.readDoc('It is an AI.');
console.log(doc.tokens().out(its.lemma));
// [ 'it', 'be', 'an', 'be', '.' ]
doc.printTokens();

token      p-spaces   prefix  suffix  shape   case    nerHint type     normal/pos
———————————————————————————————————————————————————————————————————————————————————————
It                0   It      It      Xx      3       0       word     it / PRON
is                1   is      is      xx      1       0       word     is / AUX
an                1   an      an      xx      1       0       word     an / DET
AI                1   AI      AI      XX      2       0       word     ai / **AUX**
.                 0   .       .       .       0       0       punctuat . / PUNCT


total number of tokens: 5

Package versions: "wink-eng-lite-web-model": "^1.8.0", "wink-nlp": "^2.3.0"
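To check whether other acronyms are affected in the same way, the token values, POS tags, and lemmas can be scanned side by side. The snippet below is only a minimal sketch: `findSuspectAcronyms` is a hypothetical helper, not part of wink-nlp, and the sample arrays are copied from the output reported above rather than produced by a live model.

```javascript
// Hypothetical helper (not part of wink-nlp): flags all-uppercase tokens
// whose lemma differs from their lowercased surface form.
function findSuspectAcronyms(values, tags, lemmas) {
  const suspects = [];
  values.forEach((value, i) => {
    // Treat two or more consecutive capital letters as an acronym.
    const isAcronym = /^[A-Z]{2,}$/.test(value);
    if (isAcronym && lemmas[i] !== value.toLowerCase()) {
      suspects.push({ index: i, value, pos: tags[i], lemma: lemmas[i] });
    }
  });
  return suspects;
}

// Values copied from the printTokens() output and lemma list in this report.
const values = ['It', 'is', 'an', 'AI', '.'];
const tags = ['PRON', 'AUX', 'DET', 'AUX', 'PUNCT'];
const lemmas = ['it', 'be', 'an', 'be', '.'];

const suspects = findSuspectAcronyms(values, tags, lemmas);
console.log(suspects);
```

With a live document, the three arrays could presumably be obtained from `doc.tokens().out(its.value)`, `doc.tokens().out(its.pos)`, and `doc.tokens().out(its.lemma)`, so the same check can be run over other sentences such as the "AI field" example.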

moskaliukua, Aug 21 '24 22:08