cspell-dicts

Alternate German Spellings

Open Jason3S opened this issue 3 years ago • 3 comments

Thank you for all your effort! May I ask you to have a look at the German dictionary, too?

To achieve UTF-8 compliance, it is common in German to substitute umlauts:

  • Ä -> Ae
  • Ö -> Oe
  • Ü -> Ue
  • ä -> ae
  • ö -> oe
  • ü -> ue

Currently, my solution is to add every such word I encounter to a custom dictionary, but it would be great to have these substitutions built in.

Cheers,
Arne

Originally posted by @ar-std in https://github.com/streetsidesoftware/vscode-cspell-dict-extensions/issues/12#issuecomment-1153812578

Jason3S • Jun 13 '22 16:06

@ar-std,

I'm guessing that these are approved alternate spellings of German words. Do you have a reference?

Jason3S • Jun 13 '22 16:06

I'm not sure if there is any 'approved reference'. I think it is just common usage (mainly for data processing, as these special characters are not available on most keyboards/encodings). On Wikipedia there is only a link to some Oracle reference that explains how they use it...

I would like to help with this, but I'm not at all familiar with how cspell works. Is it enough to iterate over the *.dic file with some regex magic, duplicating every line that contains umlauts and replacing the umlauts in the duplicate? Or can that be handled more easily and globally by some settings?

Additionally, I forgot one mapping (and here are some additions for all-caps words):

  • Ä -> Ae (AE in all-caps)
  • Ö -> Oe (OE in all-caps)
  • Ü -> Ue (UE in all-caps)
  • ä -> ae
  • ö -> oe
  • ü -> ue
  • ß -> ss
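
The mapping above, together with the "duplicate every line" idea, could be sketched roughly as follows. This is only an illustration in Python, not how cspell does it: `transliterate` and `expand_dic_lines` are hypothetical helper names, the input is assumed to be Hunspell-style `word/FLAGS` entries without the leading word-count line, and forms that only appear through affix rules in the .aff file are not covered.

```python
# Mapping taken from the list above; everything else is an illustrative assumption.
UMLAUT_MAP = {
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ä": "ae", "ö": "oe", "ü": "ue",
    "ß": "ss",
}


def transliterate(word: str) -> str:
    """Return the ASCII spelling of `word` (hypothetical helper)."""
    # ß counts as lowercase in Python, so ignore it when testing for all-caps.
    all_caps = word.replace("ß", "").isupper()
    result = "".join(UMLAUT_MAP.get(ch, ch) for ch in word)
    # In all-caps words use AE/OE/UE/SS instead of Ae/Oe/Ue/ss.
    return result.upper() if all_caps else result


def expand_dic_lines(lines):
    """Keep every entry and add a transliterated duplicate where it differs.

    Assumes each entry looks like `word` or `word/FLAGS`; only the word
    part is rewritten, the flags are copied unchanged.
    """
    expanded = []
    for line in lines:
        expanded.append(line)
        word, sep, flags = line.partition("/")
        alternate = transliterate(word)
        if alternate != word:
            expanded.append(alternate + sep + flags)
    return expanded
```

For example, `expand_dic_lines(["Haus/EST", "Größe/N"])` keeps both entries and adds `Groesse/N`. If the result is written back as a .dic file, the word count on its first line would need to be updated as well.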

ar-std • Jun 29 '22 10:06

@ar-std,

Thank you for the help.

Since the .dic and .aff files come from an external source, I think it is better to just copy them (src/hunspell/index.(aff/dic)) to a new directory in src, keeping only the impacted words in the .dic file. We can then create a substitution rule that replaces Ä with Ae, and likewise for the other umlauts. Words with ß will need to be duplicated, with ß replaced by ss in the duplicate.
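
A rough sketch of the "keep only the impacted words" step, written in Python purely for illustration: `select_impacted` is a hypothetical name, entries are assumed to be Hunspell-style `word/FLAGS` lines without the word-count header, and the umlaut substitution rule itself is left out because its final form is not settled here.

```python
UMLAUTS = set("ÄÖÜäöü")


def select_impacted(entries):
    """Keep only entries containing an umlaut or ß (hypothetical helper).

    Umlaut words are kept as-is, on the assumption that a separate
    substitution rule will handle them. Words with ß additionally get
    a duplicate entry with ß replaced by ss.
    """
    selected = []
    for entry in entries:
        # Only inspect the word part; flags after "/" are copied unchanged.
        word, sep, flags = entry.partition("/")
        has_umlaut = any(ch in UMLAUTS for ch in word)
        has_eszett = "ß" in word
        if not (has_umlaut or has_eszett):
            continue  # not impacted, drop from the reduced .dic
        selected.append(entry)
        if has_eszett:
            selected.append(word.replace("ß", "ss") + sep + flags)
    return selected
```

With the input `["Haus/EST", "Größe/N", "Straße", "Tür/S"]`, this drops `Haus/EST`, keeps the other three entries, and adds `Grösse/N` and `Strasse`.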

Is this related to #603?

Jason3S • Jun 29 '22 12:06