Dutch verbs with the suffixes -t/-te/-ten/-de/-den are not stemmed correctly
Right now we do not stem the verb suffixes -t/-te/-ten/-de/-den, because these endings are often part of the stem rather than a suffix. We should handle this problem with three rules:
- If the word ends in -t/-te/-ten/-de/-den, check whether what precedes this ending results in a letter combination that we know can only belong to a noun or adjective. If the word ends in such a combination, we do not stem it.
- If the word ends in -t/-te/-ten/-de/-den, check whether what precedes this ending results in a letter combination that we know can only belong to a conjugated verb (a verb with a suffix). If the word ends in such a combination, we remove the suffix.
- In all other cases, we create two stems: one with -t/-te/-ten/-de/-den removed (in case it is a suffix), and one with the ending attached (in case it is part of the stem).
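The three rules above could be sketched roughly as follows. This is only an illustration, not the code from the linked PRs: the function name `stemAmbiguousEnding`, its parameters, and the letter combinations in the usage example are all hypothetical placeholders, and the real lists of noun/adjective-only and verb-only combinations would have to come from the language configuration.

```javascript
// The endings that can be either a verb suffix or part of the stem.
const AMBIGUOUS_ENDINGS = [ "ten", "den", "te", "de", "t" ];

/**
 * Returns the possible stems for a word with an ambiguous ending.
 * Hypothetical helper: both combination lists are assumptions passed in by the caller.
 *
 * @param {string}   word                The word to stem.
 * @param {string[]} nonVerbCombinations Letter combinations before the ending that only occur in nouns/adjectives.
 * @param {string[]} verbCombinations    Letter combinations before the ending that only occur in conjugated verbs.
 *
 * @returns {string[]} One stem when the case is unambiguous, two stems otherwise.
 */
function stemAmbiguousEnding( word, nonVerbCombinations, verbCombinations ) {
	const ending = AMBIGUOUS_ENDINGS.find(
		( candidate ) => word.length > candidate.length && word.endsWith( candidate )
	);

	// The word does not end in -t/-te/-ten/-de/-den: nothing to do here.
	if ( ! ending ) {
		return [ word ];
	}

	const base = word.slice( 0, -ending.length );
	const baseEndsInOneOf = ( combinations ) => combinations.some(
		( combination ) => base.endsWith( combination )
	);

	// Rule 1: the ending is part of a noun or adjective, so we do not stem.
	if ( baseEndsInOneOf( nonVerbCombinations ) ) {
		return [ word ];
	}

	// Rule 2: the ending can only be a verb suffix, so we remove it.
	if ( baseEndsInOneOf( verbCombinations ) ) {
		return [ base ];
	}

	// Rule 3: ambiguous, so we keep both candidates.
	return [ base, word ];
}

// Usage with made-up placeholder combinations (not real linguistic data):
const nonVerb = [ "ach" ];
const verb = [ "rk" ];

stemAmbiguousEnding( "nacht", nonVerb, verb );  // [ "nacht" ]
stemAmbiguousEnding( "werkte", nonVerb, verb ); // [ "werk" ]
stemAmbiguousEnding( "feest", nonVerb, verb );  // [ "fees", "feest" ]
```

Returning an array in every case keeps the caller simple: it can match any of the returned stems against the keyphrase without needing to know which rule applied.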
PRs: https://github.com/Yoast/javascript/pull/436 https://github.com/Yoast/YoastSEO.js-premium-configuration/pull/81