Dutch verbs with the suffixes -t/-te/-ten/-de/-den are not stemmed correctly

Open agnieszkaszuba opened this issue 5 years ago • 0 comments

Right now we do not stem the verb suffixes -t/-te/-ten/-de/-den because they are often part of the stem. We should implement three solutions to this problem:

  • If the word ends in -t/-te/-ten/-de/-den, check whether what precedes this ending results in a letter combination that we know can only belong to a noun or adjective. If the word ends in such a combination, we do not stem it.
  • Otherwise, check whether what precedes this ending results in a letter combination that we know can only belong to a conjugated verb (a verb with a suffix). If the word ends in such a combination, we remove the suffix.
  • In all other cases, we create two stems: one with -t/-te/-ten/-de/-den removed (in case it is a suffix), and one with the ending attached (in case it is part of the stem).
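
A minimal sketch of the three checks above, assuming the two lists of letter combinations are supplied as plain arrays of word endings. The function name `stemAmbiguousEnding` and the example list entries are hypothetical placeholders for illustration; they are not taken from the linked PRs, and the real lists would come from the language configuration.

```javascript
// Endings named in this issue, longest first so that "-ten" is tried before "-te" and "-t".
const AMBIGUOUS_ENDINGS = [ "ten", "den", "te", "de", "t" ];

// Placeholder lists (assumptions, for illustration only).
const exampleNonVerbEndings = [ "heden" ];
const exampleVerbEndings = [ "eerde" ];

/**
 * Returns the possible stems of a word that may end in -t/-te/-ten/-de/-den.
 *
 * @param {string}   word           The word to stem.
 * @param {string[]} nonVerbEndings Word endings that can only belong to a noun or adjective.
 * @param {string[]} verbEndings    Word endings that can only belong to a conjugated verb.
 * @returns {string[]} One stem if the case is unambiguous, otherwise two.
 */
function stemAmbiguousEnding( word, nonVerbEndings, verbEndings ) {
    const ending = AMBIGUOUS_ENDINGS.find(
        ( candidate ) => word.length > candidate.length && word.endsWith( candidate )
    );

    // The word does not end in one of the ambiguous endings: nothing to do here.
    if ( ! ending ) {
        return [ word ];
    }

    // Case 1: the ending is part of a noun or adjective stem, so keep the word as is.
    if ( nonVerbEndings.some( ( combination ) => word.endsWith( combination ) ) ) {
        return [ word ];
    }

    const withoutEnding = word.slice( 0, -ending.length );

    // Case 2: the ending can only be a verb suffix, so remove it.
    if ( verbEndings.some( ( combination ) => word.endsWith( combination ) ) ) {
        return [ withoutEnding ];
    }

    // Case 3: ambiguous, so return both candidate stems.
    return [ withoutEnding, word ];
}

console.log( stemAmbiguousEnding( "mogelijkheden", exampleNonVerbEndings, exampleVerbEndings ) );
console.log( stemAmbiguousEnding( "studeerde", exampleNonVerbEndings, exampleVerbEndings ) );
console.log( stemAmbiguousEnding( "werkte", exampleNonVerbEndings, exampleVerbEndings ) );
```

With the placeholder lists, the first call keeps the word unchanged, the second removes -de, and the third returns both "werk" and "werkte". Returning an array in every case lets the caller treat unambiguous and ambiguous words the same way.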

PRs: https://github.com/Yoast/javascript/pull/436 https://github.com/Yoast/YoastSEO.js-premium-configuration/pull/81

agnieszkaszuba avatar Nov 05 '19 14:11 agnieszkaszuba