
Cannot ignore lowerCamelCase words by Hunspell

Open anikitin opened this issue 1 year ago • 3 comments

I am using LanguageTool to verify API documentation. Some documentation fragments include variable names in lower camel case.

I used the option `fsa.dict.speller.ignore-camel-case=true` as suggested in https://dev.languagetool.org/hunspell-support.html. However, it seems to work only for UpperCamelCase words, not for lowerCamelCase ones.

Example tested via the API with LanguageTool 6.3 and `fsa.dict.speller.ignore-camel-case=true`:

  1. "License blocks are grouped by 'skuId'" --> "Possible spelling mistake found."
  2. "License blocks are grouped by 'SkuId'" --> no matches

anikitin avatar Jan 21 '24 00:01 anikitin

> fsa.dict.speller.ignore-camel-case=true

These options are defined in Morfologik, not in LanguageTool.

In LanguageTool, you could add a disambiguation rule in disambiguation.xml for a particular language (or disambiguation-global.xml for all languages). You have to do it in your personal installation. For now, there is no way to make it configurable in LanguageTool, but maybe it would be a good idea to have this option.

The rule could be something like this:

<rule id="IGNORE_LOWER_CAMEL_CASE" name="Ignore lowerCamelCaseWords">
    <pattern>
        <token regexp="yes">\p{Ll}+\p{Lu}.*</token>
    </pattern>
    <disambig action="ignore_spelling"/>
</rule>
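
To see which tokens the expression in that rule would catch, here is a minimal Java sketch that applies the same regular expression to a few sample tokens outside LanguageTool. The class name `LowerCamelCaseDemo` and the helper `isIgnored` are made up for illustration. The sketch assumes the expression is matched against the whole token and case-sensitively; if the rule matches too broadly inside LanguageTool, adding `case_sensitive="yes"` to the token may be needed, which is not verified here.

```java
import java.util.regex.Pattern;

public class LowerCamelCaseDemo {
    // Same expression as in the rule's <token regexp="yes"> element.
    // Compiled without CASE_INSENSITIVE (an assumption of this sketch).
    static final Pattern LOWER_CAMEL = Pattern.compile("\\p{Ll}+\\p{Lu}.*");

    // Hypothetical helper: true if the whole token matches the expression,
    // i.e. one or more lowercase letters, then an uppercase letter, then anything.
    static boolean isIgnored(String token) {
        return LOWER_CAMEL.matcher(token).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"skuId", "SkuId", "license", "sku_id"};
        for (String token : samples) {
            System.out.println(token + " -> " + isIgnored(token));
        }
    }
}
```

Under these assumptions only `skuId` matches: `SkuId` starts with an uppercase letter, `license` has no uppercase letter, and in `sku_id` the lowercase run is followed by an underscore rather than an uppercase letter.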

jaumeortola avatar Jan 22 '24 12:01 jaumeortola

@jaumeortola , thanks a lot for this workaround. I will give it a try!

anikitin avatar Jan 22 '24 16:01 anikitin

The workaround works. For camel case, it ignores all occurrences as expected; for snake case, not always, as I mentioned in #10137.

I still think that this ticket can be kept open as a future enhancement. Up to you.

anikitin avatar Jan 22 '24 21:01 anikitin