stroke-input-android

English Keyboard Auto Suggestion For Words

Open. luinniuxx opened this issue 2 months ago • 1 comment

I know this project mostly focuses on Chinese stroke input, but English is also commonly used and typed. Is there any way we can implement auto-suggestion?

Some other open-source projects for English auto-suggestion:

https://github.com/futo-org/android-keyboard

https://github.com/AnySoftKeyboard/AnySoftKeyboard

Thanks,

luinniuxx, Oct 07 '25 22:10

I am hesitant to add English suggestions, both from a technical point of view and from a philosophical point of view.

Technical

With Chinese, the algorithms for (1) determining the candidate characters (given a partial stroke sequence) and (2) determining candidate phrase completions (given the characters before the cursor) are simply based on deterministic prefix matching (implemented with a TreeMap and a TreeSet loaded into memory).
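To make the prefix-matching idea concrete, here is a minimal sketch of both lookups. It is not the keyboard's actual code: the class name, the method names, the digit encoding of strokes, and the sample entries are all assumptions made for illustration. It relies only on the fact that, in a sorted map or set, all keys sharing a prefix sit next to each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: names and data layout are illustrative assumptions,
// not the project's real implementation.
class PrefixLookup {
    // Sorted map: stroke sequence (assumed here to be a string of digits) -> characters.
    private final NavigableMap<String, String> charactersByStrokes = new TreeMap<>();
    // Sorted set of known phrases.
    private final NavigableSet<String> phrases = new TreeSet<>();

    void addCharacters(String strokes, String characters) {
        charactersByStrokes.put(strokes, characters);
    }

    void addPhrase(String phrase) {
        phrases.add(phrase);
    }

    // (1) Characters whose stroke sequence starts with the strokes typed so far.
    List<String> candidateCharacters(String typedStrokes) {
        List<String> result = new ArrayList<>();
        // Start at the first key >= the prefix and stop at the first key
        // that no longer starts with it; matches are contiguous in sorted order.
        for (Map.Entry<String, String> entry
                : charactersByStrokes.tailMap(typedStrokes, true).entrySet()) {
            if (!entry.getKey().startsWith(typedStrokes)) {
                break;
            }
            result.add(entry.getValue());
        }
        return result;
    }

    // (2) Remainders of phrases that start with the text before the cursor.
    List<String> phraseCompletions(String textBeforeCursor) {
        List<String> result = new ArrayList<>();
        for (String phrase : phrases.tailSet(textBeforeCursor, true)) {
            if (!phrase.startsWith(textBeforeCursor)) {
                break;
            }
            // Skip the phrase that equals the prefix itself: nothing left to complete.
            if (phrase.length() > textBeforeCursor.length()) {
                result.add(phrase.substring(textBeforeCursor.length()));
            }
        }
        return result;
    }
}
```

With sample entries "1" → 一, "11" → 二, "111" → 三, "12" → 十, a call to `candidateCharacters("11")` yields 二 and 三 and nothing else. The result is fully determined by the data and the input; there is no ranking model and nothing is learned from the user.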

This is an extremely rudimentary solution, but due to the nature of Chinese, it seems to be (barely) usable. Occasionally I get complaints on Google Play about the keyboard not learning from user input. But that's an explicitly stated feature, not a bug.

With English, candidates are determined by some notion of edit distance. This seems (to me) a lot harder to implement (technically) than prefix matching. Moreover, what constitutes a good suggestion/autocorrect algorithm is a notoriously difficult problem:

  • https://github.com/AnySoftKeyboard/AnySoftKeyboard/issues/4422
  • https://github.com/AnySoftKeyboard/AnySoftKeyboard/issues/2821
  • https://github.com/futo-org/android-keyboard/issues/522
  • https://github.com/futo-org/android-keyboard/issues/300
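For comparison with prefix matching, here is a minimal sketch of one common notion of edit distance, the Levenshtein distance, plus a naive filter over a word list. The choice of Levenshtein, the class and method names, and the threshold parameter are assumptions for illustration; real English keyboards use considerably more elaborate models.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Levenshtein distance is only one possible notion of
// edit distance, and this brute-force filter is not how any real keyboard works.
class EditDistanceSketch {
    // Minimum number of single-character insertions, deletions, and
    // substitutions needed to turn a into b (two-row dynamic programming).
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning "" into the first j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning the first i characters of a into ""
            for (int j = 1; j <= b.length(); j++) {
                int substitution = previous[j - 1]
                        + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    // Every dictionary word within maxDistance of what was typed, in dictionary order.
    static List<String> suggest(String typed, List<String> dictionary, int maxDistance) {
        List<String> result = new ArrayList<>();
        for (String word : dictionary) {
            if (levenshtein(typed, word) <= maxDistance) {
                result.add(word);
            }
        }
        return result;
    }
}
```

Even this toy shows the difficulty: for the typo "teh", plain Levenshtein distance puts "tea" and "ten" at distance 1 but the intended "the" at distance 2, since a transposition costs two edits. Getting sensible suggestions requires further decisions about transpositions, key adjacency, word frequency, and ranking, none of which arise with prefix matching.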

Philosophical

As seen above with https://github.com/futo-org/android-keyboard/issues/300, one of the contentious issues with English suggestions is whether to omit words that are offensive. I can think of some words that English keyboards with suggestion/autocorrect would probably steer the user away from typing.

I do not believe in censoring my users. You might argue that I could please everyone by implementing an option to toggle whether offensive words are censored from the candidates. For me, the issue is not whether there is a toggle, but the fact that censorship could occur. I am not one for being an arbiter of what is offensive, in English or in Chinese.

You might also argue that a lot of offensive phrases are absent from my phrases-traditional.txt and phrases-simplified.txt. But at the character level, there is complete coverage of Unified Ideographs in the Basic Multilingual Plane. If a user wants to type an offensive character (or a character that completes an offensive phrase), and has started off with enough of the correct sequence of strokes, the desired character will appear as a candidate. No exceptions.

yawnoc, Oct 08 '25 17:10