Miscellaneous search ranking issues

Open aarppe opened this issue 3 years ago • 2 comments

This is an issue for collecting possible glitches in search ranking, which may be due to an error in the formula or the supporting FSTs, or something else:

  • [ ] Searching for "my bears" (analyzed as bear +N+Px1Sg+Pl) receives a zero POS-match for maskwa, though it should receive at least 0.333, cf. the attached images

  • [ ] "my bear" does not receive the POS code +N (this is a bug in the phrase analyzer FST), cf. image

  • [ ] missing cosine vector distance value and/or morpheme ranking value: image
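One possible reading of the 0.333 figure in the first item, sketched here purely as an assumption: if the POS-match were the fraction of tags in the query analysis that the entry also carries, then matching only +N out of +N+Px1Sg+Pl would give 1/3. The function names `split_tags` and `pos_match` are hypothetical, as is the tag set given for maskwa; the actual ranking formula may well be defined differently.

```python
def split_tags(analysis: str) -> list[str]:
    """Split an FST-style tag string such as '+N+Px1Sg+Pl' into its tags."""
    return ["+" + tag for tag in analysis.split("+") if tag]


def pos_match(query_analysis: str, entry_tags: set[str]) -> float:
    """Hypothetical score: fraction of query tags also present on the entry."""
    query_tags = split_tags(query_analysis)
    if not query_tags:
        return 0.0
    shared = sum(1 for tag in query_tags if tag in entry_tags)
    return shared / len(query_tags)


# Assumed tag set for the entry maskwa (illustration only).
score = pos_match("+N+Px1Sg+Pl", {"+N"})
print(round(score, 3))  # 0.333
```

Under this reading, a score of zero would mean that not even +N was compared successfully, which would be consistent with the missing +N reported in the second item.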

aarppe, Jun 21 '22 20:06

Addition: nîpin now appears on CW both as VII and NI, in two entries. If this is correct, there's a question of whether noun or verb should get priority, all other things being equal.

fbanados, Jun 24 '24 19:06

Yes, there are a few words that are used both as nouns and verbs, usually expressions of time, such as the word for "day", cf. https://itwewina.altlab.app/search?q=k%C3%AEsik%C3%A2w

Search ranking should not give nouns or verbs priority as such. It should follow whatever comes out of the ranking algorithm, which draws on the various frequencies of lemmas, morphemes, and glossary occurrences (as those should distinguish between noun and verb readings).
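A minimal sketch of that idea, under stated assumptions: the `Candidate` fields, the unweighted sum in `score`, and all numeric values below are placeholders invented for illustration, not corpus figures or the actual ranking features. The point is only that the order of the noun and verb entries falls out of the frequency-based score, with no word-class rule.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    lemma: str
    word_class: str
    lemma_freq: float     # hypothetical normalized lemma frequency
    morpheme_freq: float  # hypothetical normalized morpheme frequency
    gloss_freq: float     # hypothetical normalized glossary occurrence


def score(candidate: Candidate) -> float:
    """Hypothetical combined score: an unweighted sum of the three features."""
    return candidate.lemma_freq + candidate.morpheme_freq + candidate.gloss_freq


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort by score, highest first; sorted() is stable, so ties keep input order."""
    return sorted(candidates, key=score, reverse=True)


# Placeholder values only; they do not reflect real frequencies.
entries = [
    Candidate("nîpin", "VII", 0.25, 0.25, 0.25),
    Candidate("nîpin", "NI", 0.5, 0.25, 0.25),
]
ranked = rank(entries)
print([c.word_class for c in ranked])
```

If two readings end up with exactly the same score, this sketch simply keeps their input order, so an explicit tie-break would still be a separate decision.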

aarppe, Jun 25 '24 16:06