
Fix glitches in "inflected" English phrase translation

Open aarppe opened this issue 4 years ago • 3 comments

The English phrase generation of some forms does not work for some of the English definitions, which needs to be fixed in the generator FSTs:

Verbs:

  • [x] various accented characters missing from the set of accepted input characters
  • [x] occurrence of it when it is not the actual object in the phrase (often parenthesized), e.g. s/he finishes (it/him) for s.o.; s/he tans (it/him) for s.o., or when it is the (implicit or explicit) object and thus likewise not an element that should be inflected, e.g. He shines a light on it.
  • [x] object marker s.o. in parentheses, indicating an implicit object that should not be inflected, e.g. s/he has (s.o. as) a mother-in-law
  • [x] reflexive object him/herself, most likely a former code glitch concerning all but 1Sg and 2Sg forms, e.g. s/he gives him/herself a difficult time, s/he makes things difficult for him/herself, s/he is very tough on him/herself
  • [ ] plural nouns ending in -s that could subsequently be analyzed as verbs if rule-based analysis is applied, e.g. powers in s/he is released, s/he is let go by the powers, s/he is set down by the powers; s/he is permitted by the powers
  • [x] use of 12 for inclusive first person plural forms, instead of 21.

Nouns:

  • [x] various accented characters missing from the set of accepted input characters
  • [x] plural suffix turning up in (too many) wrong places

aarppe avatar Mar 06 '21 21:03 aarppe

In addition, there appear to be some extra-FST glitches:

  • [x] swap 12 to 21 in crk analyses - this may only apply to the object cases, i.e. +12PlO+ -> +21PlO+. For example, Prt+4Sg/Pl+12PlO+ s/he raises s.o. in poverty; s/he raises s.o. as an orphan does not generate, whereas Prt+4Sg/Pl+21PlO+ s/he raises s.o. in poverty; s/he raises s.o. as an orphan does generate s/he/they raised you and us in poverty; s/he/they raised you and us as an orphan. We might want to check this for the possessors as well, which in generation should be Px21Pl+ instead of Px12Pl+.
  • [ ] Subject already converted to he/she rather than the CW original s/he, e.g. Imm+2Pl+ he/she shines a light on it or Prt+21Pl+4Sg/PlO+ he/she finishes (it/him) for him/her/them; he/she tans (it/him) for him/her/them
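
To make the 12 -> 21 swap concrete, here is a minimal sketch of how it could be done on the analysis string before generation. The helper name `swap_inclusive_tags` and the regular expression are assumptions made for illustration, not the code morphodict actually uses; the sketch covers only the tag shapes mentioned above (`+12PlO+`, `+12Pl+`, `Px12Pl+`).

```python
import re

# Hypothetical pattern: "12Pl", optionally followed by the object marker "O",
# then "+". The lookbehind avoids touching a longer digit sequence.
_INCLUSIVE_TAG = re.compile(r"(?<![0-9])12Pl(O?)\+")


def swap_inclusive_tags(analysis: str) -> str:
    """Rewrite inclusive first person plural tags from 12 to 21."""
    return _INCLUSIVE_TAG.sub(r"21Pl\1+", analysis)


# Tags taken from the examples in this issue.
print(swap_inclusive_tags("Prt+4Sg/Pl+12PlO+"))
print(swap_inclusive_tags("Px12Pl+"))
```

Analyses that already use 21, such as Prt+21Pl+4Sg/PlO+, pass through unchanged.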

aarppe avatar Mar 10 '21 19:03 aarppe

Once the above matters are resolved, we go down from 512,210 non-generated forms to only some 43,493 missing ones, cf.

cat inc/phrases/verbs.phrases | grep '+?' | grep -v 'him/herself' | grep -v '(s.o. ' | egrep -v '\<it\>' | grep -v 12 | grep -v 'he/she' | grep -v 4 | wc -l
   43493
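
For anyone who wants to rerun this count outside the shell, the pipeline above can be sketched in Python roughly as follows. The function names are hypothetical, and the sample lines are made up in the style of the examples in this issue; the real layout of `verbs.phrases` may differ. As in the shell version, the `12` and `4` filters are plain substring matches, so they also drop any line that merely contains those digits.

```python
import re

# One pattern per `grep -v` stage in the shell pipeline.
# "." is left unescaped in "(s.o. " to mirror grep's behaviour.
EXCLUDE = [
    re.compile(p)
    for p in (r"him/herself", r"\(s.o. ", r"\bit\b", r"12", r"he/she", r"4")
]


def is_unexplained_failure(line: str) -> bool:
    """True for a non-generated line ('+?') not covered by a known glitch."""
    if "+?" not in line:
        return False
    return not any(p.search(line) for p in EXCLUDE)


def count_unexplained(lines) -> int:
    return sum(1 for line in lines if is_unexplained_failure(line))


# Made-up sample lines (assumed format: tags, English phrase, tab, result).
sample = [
    "Imm+2Pl+ s/he sits\t+?",
    "Imm+2Pl+ s/he sits\tyou all sit",
    "Imm+2Pl+ he/she shines a light on it\t+?",
    "Prt+4Sg/Pl+12PlO+ s/he raises s.o. in poverty\t+?",
    "Imm+2Pl+ s/he has (s.o. as) a mother-in-law\t+?",
]
print(count_unexplained(sample))
```

To run it on the real file, pass the open file object instead of `sample`, e.g. `count_unexplained(open("inc/phrases/verbs.phrases", encoding="utf-8"))`.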

aarppe avatar Mar 11 '21 03:03 aarppe

For noun phrases, many if not most of the non-generated entries are not properly constructed with an initial feature, e.g.

cat inc/phrases/nouns.phrases| grep '+?' | head -10
 Piegan country, in the Piegan country	+?
 small piece of cloth, scrap	+?
 domestic animal	+?
 shorts; underwear	+?
 crab; lobster	+?
 birthday cake	+?
 my vagina, my vulva	+?
 cucumber; literally: our deceased grandmother	+?
 Shoal Lake Cree Nation, SK; Cree reserve	+?
 intestine

The rest appear to be cases with diacritic characters, which now ought to be fixed.

aarppe avatar Mar 28 '21 08:03 aarppe