morphodict
Modified edit-distance matching misses some edits or has a suboptimal weighting algorithm
Below is a selected set of test strings for which weighted edit-distance matching works in some cases but not in others [+: expected behavior; -: unexpected behavior; *: fixed - UPDATED on 11.1.2023]:
+ mitsiw -> + mîciw
* micow -> * mîciw
* mitsow -> * mîciw
+ neeyu -> + niya, + niyâ
* neeyuh -> * niya, * niyâ
+ nigi-nidawi-wapamaw -> + nikî-nitawi-wâpamâw
* nigi-nidawi-wabamaw -> * nikî-nitawi-wâpamâw
* nibaw -> * nipâw
* nohte -> * nôhtê-
* nohte- -> * nôhtê-
+ mitâs -> + mitâs
* nitâs -> + nitâs, * mitâs [not linked to lemma `mitâs`]
* nitas -> - mitâs, * nitâs [neither result]
* mitas -> * mitâs
In principle, the edit weighting should also produce the following ranking:
ewapamat -> ê-wâpamat < ê-wâpamât (1 edit less)
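The intended ranking can be illustrated with a minimal weighted edit-distance sketch. The weights and helper names below (`DIACRITIC_COST`, `DEFAULT_COST`, `weighted_distance`, `rank`) are assumptions chosen for illustration; they are not morphodict's actual algorithm or the FST's actual weights. The only point is that a candidate needing one fewer vowel-length edit should sort first.

```python
import unicodedata

# Illustrative weights only (assumptions, not morphodict's or the FST's values).
DIACRITIC_COST = 0.25  # adding or dropping a circumflex, e.g. a <-> â
DEFAULT_COST = 1.0     # any other insertion, deletion, or substitution

LONG_TO_SHORT = {"â": "a", "ê": "e", "î": "i", "ô": "o"}


def _base(ch):
    """Return the vowel without its length mark; other characters unchanged."""
    return LONG_TO_SHORT.get(ch, ch)


def _sub_cost(a, b):
    if a == b:
        return 0.0
    if _base(a) == _base(b):
        return DIACRITIC_COST
    return DEFAULT_COST


def weighted_distance(source, target):
    """Levenshtein distance with a cheaper cost for vowel-length differences."""
    # Normalize so that each long vowel is a single precomposed character.
    source = unicodedata.normalize("NFC", source)
    target = unicodedata.normalize("NFC", target)
    prev = [j * DEFAULT_COST for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [i * DEFAULT_COST]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + DEFAULT_COST,         # delete s
                curr[j - 1] + DEFAULT_COST,     # insert t
                prev[j - 1] + _sub_cost(s, t),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def rank(query, candidates):
    """Sort candidates from cheapest to most expensive edit path."""
    return sorted(candidates, key=lambda c: weighted_distance(query, c))


ranking = rank("ewapamat", ["ê-wâpamât", "ê-wâpamat"])
```

Under these assumed weights, `ê-wâpamat` sorts ahead of `ê-wâpamât` because it needs one fewer diacritic edit. Any weighting in which each edit has a positive cost would give the same order for this pair.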
Here is most of the above run through, and recognized by, a descriptive FST with a weighted spell-relax:
```
hfst-lookup -q ../../inc/crk-anl-desc-w.hfst
nitas
nitas mitâs+N+A+D+Px1Sg+Sg 0.250000
nitas mitâs+N+I+D+Px1Sg+Sg 0.250000
nitas nitâs+N+A+D+Px1Sg+Sg 0.250000
nitas nitâs+N+I+D+Px1Sg+Sg 0.250000
mitas
mitas mitâs+N+A+D+PxX+Sg 0.250000
mitas mitâs+N+I+D+PxX+Sg 0.250000
mitas mihtâtêw+V+TA+Imp+Imm+2Sg+3SgO 0.750000
mitsiw
mitsiw mîciw+V+TI+Ind+Prs+3Sg 0.750000
micow
micow mîciw+V+TI+Ind+Prs+3Sg 0.750000
mitsow
mitsow mîciw+V+TI+Ind+Prs+3Sg 1.250000
neeyu
neeyu niya+Pron+Pers+1Sg 0.000000
neeyu niyâ+Ipc 0.000000
neeyuh
neeyuh niya+Pron+Pers+1Sg 0.000000
neeyuh niyâ+Ipc 0.000000
nigi-nitawi-wapamaw
nigi-nitawi-wapamaw PV/nitawi+wâpamêw+V+TA+Ind+Prt+1Sg+3SgO 0.750000
mitâs
mitâs mitâs+N+A+D+PxX+Sg 0.000000
mitâs mitâs+N+I+D+PxX+Sg 0.000000
mitâs mihtâtêw+V+TA+Imp+Imm+2Sg+3SgO 0.500000
nitâs
nitâs mitâs+N+A+D+Px1Sg+Sg 0.000000
nitâs mitâs+N+I+D+Px1Sg+Sg 0.000000
nitâs nitâs+N+A+D+Px1Sg+Sg 0.000000
nitâs nitâs+N+I+D+Px1Sg+Sg 0.000000
nitas
nitas mitâs+N+A+D+Px1Sg+Sg 0.250000
nitas mitâs+N+I+D+Px1Sg+Sg 0.250000
nitas nitâs+N+A+D+Px1Sg+Sg 0.250000
nitas nitâs+N+I+D+Px1Sg+Sg 0.250000
mitas
mitas mitâs+N+A+D+PxX+Sg 0.250000
mitas mitâs+N+I+D+PxX+Sg 0.250000
mitas mihtâtêw+V+TA+Imp+Imm+2Sg+3SgO 0.750000
```
The errors above appear to have been resolved.
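For comparing the FST's weights against the ranking in morphodict, lookup output like the listing above could be parsed and reduced to the lowest-weight analyses per input. This is a rough sketch that assumes each result line has exactly three whitespace-separated fields (input, analysis, weight), as in the listing. The function names `parse_lookup` and `best_analyses` are hypothetical and not part of morphodict or HFST.

```python
from collections import defaultdict


def parse_lookup(lines):
    """Group (weight, analysis) pairs by input string, cheapest first."""
    results = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip echoed inputs, the command line, and blank lines
        surface, analysis, weight = fields
        results[surface].append((float(weight), analysis))
    return {surface: sorted(pairs) for surface, pairs in results.items()}


def best_analyses(parsed):
    """Keep only the analyses that share the minimum weight for each input."""
    best = {}
    for surface, pairs in parsed.items():
        lowest = pairs[0][0]
        best[surface] = [a for w, a in pairs if w == lowest]
    return best


# Sample lines copied from the lookup output in this issue.
SAMPLE = """\
mitas
mitas mitâs+N+A+D+PxX+Sg 0.250000
mitas mitâs+N+I+D+PxX+Sg 0.250000
mitas mihtâtêw+V+TA+Imp+Imm+2Sg+3SgO 0.750000
mitsow
mitsow mîciw+V+TI+Ind+Prs+3Sg 1.250000
""".splitlines()

parsed = parse_lookup(SAMPLE)
best = best_analyses(parsed)
```

With the sample above, `best["mitas"]` keeps the two `mitâs` analyses at weight 0.25 and drops the `mihtâtêw` analysis at 0.75.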