zeyrek
Empty result added after unknown word
When a sentence is analyzed and an unknown word is encountered, an empty parse list appears right after that word's result. For example:
import zeyrek

text = "Mahjong oynamayı biliyor musun?"
analyzer = zeyrek.MorphAnalyzer()
analyzer.analyze(text)
This returns the following results (note the empty list on the second line, right after the unknown word):
[Parse(word='Mahjong', lemma='Unk', pos='Unk', morphemes='Unk', formatted='Unk')]
[]
[Parse(word='oynamayı', lemma='oynamak', pos='Noun', morphemes=['Verb', 'Inf2', 'Noun', 'A3sg', 'Acc'], formatted='[oynamak:Verb] oyna:Verb|ma:Inf2→Noun+A3sg+yı:Acc')]
[Parse(word='biliyor', lemma='bilmek', pos='Verb', morphemes=['Verb', 'Prog1', 'A3sg'], formatted='[bilmek:Verb] bil:Verb+iyor:Prog1+A3sg'), Parse(word='biliyor', lemma='bilemek', pos='Verb', morphemes=['Verb', 'Prog1', 'A3sg'], formatted='[bilemek:Verb] bil:Verb+iyor:Prog1+A3sg')]
[Parse(word='musun', lemma='mu', pos='Ques', morphemes=['Ques', 'Pres', 'A2sg'], formatted='[mu:Ques] mu:Ques+Pres+sun:A2sg'), Parse(word='musun', lemma='Mu', pos='Verb', morphemes=['Noun', 'A3sg', 'Zero', 'Verb', 'Pres', 'A2sg'], formatted='[Mu:Noun,Abbrv] mu:Noun+A3sg|Zero→Verb+Pres+sun:A2sg')]
[Parse(word='?', lemma='?', pos='Punc', morphemes=['Punc'], formatted='[?:Punc] ?:Punc')]
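Until this is fixed in the library, one possible workaround is to drop the empty lists from the result before using it. The snippet below is only a sketch: it assumes the analyzer output is a list containing one list of parses per token, as shown above. `drop_empty_parses` is a hypothetical helper, and the `Parse` namedtuple is a stand-in for zeyrek's own class so that the example runs without the library installed.

```python
from collections import namedtuple

# Stand-in for zeyrek's Parse result (assumption: same field names as in the output above).
Parse = namedtuple("Parse", "word lemma pos morphemes formatted")


def drop_empty_parses(results):
    """Return the per-token parse lists with empty lists removed."""
    return [parses for parses in results if parses]


# Shortened copy of the output reported above, including the spurious empty list.
sample = [
    [Parse("Mahjong", "Unk", "Unk", "Unk", "Unk")],
    [],
    [Parse("?", "?", "Punc", ["Punc"], "[?:Punc] ?:Punc")],
]

cleaned = drop_empty_parses(sample)
for parses in cleaned:
    print(parses[0].word)
```

With the real library, the same helper would presumably be applied as `drop_empty_parses(analyzer.analyze(text))`. This only hides the symptom; the unknown word still comes back with the 'Unk' placeholder parse.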