morphodict
Fix glitches in "inflected" English phrase translation
English phrase generation fails for some forms with certain English definitions; this needs to be fixed in the generator FSTs:
Verbs:
- [x] various accented characters missing from the set of accepted input characters
- [x] occurrence of `it`, when that is not the actual object in the phrase (often parenthesized), e.g. s/he finishes (it/him) for s.o.; s/he tans (it/him) for s.o., or otherwise when `it` is the (implicit or explicit) object and thus not an element that should be inflected either, e.g. He shines a light on it.
- [x] object marker `s.o.` in parentheses, indicating an implicit object that should not be inflected, e.g. s/he has (s.o. as) a mother-in-law
- [x] reflexive object him/herself, most likely a former code glitch concerning all but `1Sg` and `2Sg` forms, e.g. s/he gives him/herself a difficult time, s/he makes things difficult for him/herself, s/he is very tough on him/herself
- [ ] plural nouns ending in `-s` that could subsequently be analyzed as verbs if rule-based analysis is applied, e.g. powers in s/he is released, s/he is let go by the powers, s/he is set down by the powers; s/he is permitted by the powers
- [x] use of 12 for inclusive first person plural forms, instead of 21.
Nouns:
- [x] various accented characters missing from the set of accepted input characters
- [x] plural suffix turning up in (too many) wrong places
In addition, there appear to be some extra-FST glitches:
- [x] swap crk analyses with `12` to `21` - this may only apply to the object cases, i.e. `+12PlO->21PlO+`, e.g. `Prt+4Sg/Pl+12PlO+ s/he raises s.o. in poverty; s/he raises s.o. as an orphan` which does not generate, in contrast to `Prt+4Sg/Pl+21PlO+ s/he raises s.o. in poverty; s/he raises s.o. as an orphan` which does generate s/he/they raised you and us in poverty; s/he/they raised you and us as an orphan. We might want to check this for the possessors as well, which in generation should be `Px21Pl+` instead of `Px12Pl+`.
- [ ] Subject already converted to he/she rather than the CW original s/he, e.g. `Imm+2Pl+ he/she shines a light on it` or `Prt+21Pl+4Sg/PlO+ he/she finishes (it/him) for him/her/them; he/she tans (it/him) for him/her/them`.
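
The `12` to `21` swap described in the first item could be sketched roughly as below. This is only an illustration, not the project's actual code: the function name `swap_inclusive_tags` is hypothetical, and it assumes the inclusive tag always appears as `12Pl`, optionally prefixed by `Px` or suffixed by `O`, and immediately followed by `+`.

```python
import re

# Assumption: the inclusive first person plural tag is "12Pl", optionally
# with a "Px" prefix (possessor) or "O" suffix (object), followed by "+".
# The lookbehind keeps the pattern from matching inside a longer tag.
_INCLUSIVE = re.compile(r"(?<![0-9A-Za-z])(Px)?12Pl(O)?(?=\+)")


def swap_inclusive_tags(analysis: str) -> str:
    """Rewrite 12Pl-style tags to 21Pl before handing the string to generation."""
    return _INCLUSIVE.sub(
        lambda m: f"{m.group(1) or ''}21Pl{m.group(2) or ''}", analysis
    )


print(swap_inclusive_tags("Prt+4Sg/Pl+12PlO+ s/he raises s.o. in poverty"))
# prints: Prt+4Sg/Pl+21PlO+ s/he raises s.o. in poverty
```

Because the pattern requires a following `+`, a plain number 12 in the English definition text is left alone.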
Once the above matters are resolved, we go down from 512,210 non-generated forms to only some 43,493 missing ones, cf.

    cat inc/phrases/verbs.phrases | grep '+?' | grep -v 'him/herself' | grep -v '(s.o. ' | egrep -v '\<it\>' | grep -v 12 | grep -v 'he/she' | grep -v 4 | wc -l
    43493

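
To see how the failures split across the known causes, rather than only the residual count, the same filters can be applied one by one. The sketch below is a rough Python equivalent of the pipeline above; the category labels and the function name `classify_failures` are made up for illustration, and each failing line is attributed to the first pattern it matches, so the per-category counts do not overlap.

```python
import re
from collections import Counter

# Patterns mirror the grep filters in the pipeline, in the same order.
# The labels are illustrative and not taken from the project.
KNOWN_CAUSES = [
    ("reflexive him/herself", re.compile(r"him/herself")),
    ("parenthesized s.o.", re.compile(r"\(s.o. ")),
    ("standalone it", re.compile(r"\bit\b")),
    ("contains 12", re.compile(r"12")),
    ("he/she subject", re.compile(r"he/she")),
    ("contains 4", re.compile(r"4")),
]


def classify_failures(lines):
    """Count non-generated lines (marked with '+?') by first matching cause."""
    counts = Counter()
    for line in lines:
        if "+?" not in line:
            continue  # this form generated, so there is nothing to classify
        for label, pattern in KNOWN_CAUSES:
            if pattern.search(line):
                counts[label] += 1
                break
        else:
            # Matches none of the known causes: what the pipeline counts.
            counts["unexplained"] += 1
    return counts
```

Called on the lines of `inc/phrases/verbs.phrases`, the `"unexplained"` entry should correspond to the number the pipeline prints.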
For noun phrases, most/many are not properly constructed with an initial feature, e.g.

    cat inc/phrases/nouns.phrases| grep '+?' | head -10
    Piegan country, in the Piegan country +?
    small piece of cloth, scrap +?
    domestic animal +?
    shorts; underwear +?
    crab; lobster +?
    birthday cake +?
    my vagina, my vulva +?
    cucumber; literally: our deceased grandmother +?
    Shoal Lake Cree Nation, SK; Cree reserve +?
    intestine

The rest appear to be cases with diacritic characters, which now ought to be fixed.
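
The two groups of failing noun lines could be told apart with a check along these lines. This is a sketch under assumptions: it treats a well-formed input as starting with a whitespace-free tag prefix that ends in `+` (as in `Px21Pl+`), and the function names and sample lines are hypothetical.

```python
import re
import unicodedata

# Assumption: a properly constructed line begins with a tag prefix such as
# "Px21Pl+", i.e. a run of non-space characters ending in "<alnum>+".
_LEADING_TAGS = re.compile(r"^\S*[A-Za-z0-9]\+\s")


def has_diacritics(text: str) -> bool:
    """True if any character carries a combining mark after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


def diagnose_noun_line(line: str) -> str:
    """Give a rough reason why a noun phrase line may have failed to generate."""
    if not _LEADING_TAGS.match(line):
        return "missing initial feature"
    if has_diacritics(line):
        return "contains diacritics"
    return "other"


for sample in ["domestic animal +?", "Px21Pl+ café +?", "Px21Pl+ house +?"]:
    print(diagnose_noun_line(sample), "|", sample)
```

Lines reported as `other` would be the ones still worth inspecting by hand once the diacritics fix is in place.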