Q: Is it possible to support per-language prefixes, i.e. not flag words with prefixes like "un", "pre", or "co"?

Open klonos opened this issue 1 year ago • 5 comments

The following are some examples that CSpell flags as wrong.

  • unflag/unflagged
  • uninstantiatable (well, that one might be flagged because instantiatable is flagged, but it is a valid word: https://en.wiktionary.org/wiki/instantiatable)
  • unrendered
  • ... etc.

Words like the above make sense in coding, although they may not be officially included in any dictionary, so they should not be flagged. Adding such words individually to the various dictionaries is impractical and kinda silly, so I believe that cspell should instead use the following logic:

  • follow the same logic as currently
  • if a word is flagged as wrong, check if it begins with "un" before actually flagging it
  • exclude the "un" and treat the remainder of the word as the word to check, and check again
  • if the rest of the word checks out, then so does the one with the "un" prefix

I believe that there should be a way to define a list of such prefixes (and perhaps suffixes as well?), and that this configuration should be per language.
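The steps above could be sketched roughly as follows. This is only an illustration in Python, not CSpell's actual code: the `PREFIXES` table, the `is_accepted` function, and the tiny `WORDS` set are all hypothetical stand-ins for a per-language configuration and a real dictionary.

```python
# Hypothetical per-language prefix lists; this is not an existing CSpell setting.
PREFIXES = {
    "en": ("un", "pre", "co"),
}

# Tiny stand-in for a real dictionary.
WORDS = {"flag", "flagged", "rendered", "optimized"}


def is_accepted(word, lang="en", words=WORDS, prefixes=PREFIXES):
    """Return True if the word, or the word minus a configured prefix, is known."""
    lowered = word.lower()
    # Step 1: same logic as currently, a plain dictionary lookup.
    if lowered in words:
        return True
    # Steps 2-4: strip a configured prefix and check the remainder instead.
    for prefix in prefixes.get(lang, ()):
        if lowered.startswith(prefix) and lowered[len(prefix):] in words:
            return True
    return False


print(is_accepted("unflagged"))   # True: "flagged" is in the word set
print(is_accepted("unrendered"))  # True: "rendered" is in the word set
print(is_accepted("unflaged"))    # False: "flaged" is not a known word
```

A language with no entry in the prefix table simply falls back to the plain lookup.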

klonos avatar Feb 16 '24 00:02 klonos

@klonos,

Thank you for the suggestion.

Doing what you suggest is not technically hard, but it is hard to ensure it is correct.

For example, unred, cogreen, and pretree would all be considered correct, simply because red, green, and tree are valid words that happen to follow un, co, or pre. And unred is a common misspelling of unread.

The dictionaries are pretty extensive. They consist of hundreds of thousands or even millions of words. Very large dictionaries are not an issue. The time to look up a word is based upon the length of the word, not the size of the dictionary. So adding unrendered to a dictionary isn't really an issue. It just takes a bit of time to add it to a word list.
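To illustrate how lookup time can depend on word length rather than dictionary size, here is a minimal sketch of a trie (prefix tree) in Python. It is an assumption made for illustration: this thread does not describe how CSpell stores its dictionaries, and `build_trie`, `contains`, and `END` are hypothetical names.

```python
END = object()  # sentinel key meaning "a word ends at this node"


def build_trie(words):
    """Build a nested-dict trie from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def contains(trie, word):
    """Walk one node per character, so the work is proportional to len(word)."""
    node = trie
    for ch in word:
        node = node.get(ch)
        if node is None:
            return False
    return END in node


trie = build_trie(["unread", "unrendered", "render", "red"])
print(contains(trie, "unrendered"))  # True
print(contains(trie, "unred"))       # False: no word follows the path u-n-r-e-d
```

Adding more words makes the tree wider, but checking a ten-letter word still takes at most ten steps.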

Jason3S avatar Feb 20 '24 09:02 Jason3S

One more: unoptimized

ADTC avatar Mar 31 '24 20:03 ADTC

@ADTC,

Please feel free to create a PR to add unoptimized, and any other words that you think are appropriate, to the US English Dictionary.

Jason3S avatar Apr 01 '24 17:04 Jason3S

I just filed a PR that adds some words that start with the "un" prefix: https://github.com/streetsidesoftware/cspell-dicts/pull/3129

It also adds instantiatable and uninstantiatable, as well as unoptimized, as suggested by @ADTC.

klonos avatar Apr 25 '24 01:04 klonos

Thank you for picking that one up. :)

ADTC avatar Apr 25 '24 02:04 ADTC