Korean input issues

Open eylenburg opened this issue 2 years ago • 25 comments

This is a continuation of #99

  1. Adding letters at the beginning of a word adds them at the end: https://github.com/Helium314/openboard/issues/99#issuecomment-1722560283

  2. Backspace doesn't work correctly while typing a word: https://github.com/Helium314/openboard/issues/99#issuecomment-1723312570

  3. Syllables disappear when typing a word that requires a "shift" letter (e.g. ㅆ = shift + ㅅ ): https://github.com/Helium314/openboard/issues/99#issuecomment-1741980218

These three issues seem to be related. The third one seems to be the most severe because there's no workaround.

eylenburg avatar Oct 16 '23 09:10 eylenburg

Issues 2 and 3 are fixed; issue 1 will require looking into the Hangul combiner.

Helium314 avatar Oct 20 '23 15:10 Helium314

I'm not sure if this is a regression, since I don't see it mentioned: every time I add punctuation directly after a word, the word is deleted, though not all punctuation does this. Periods and commas always delete the previous word. For now I switch to the English keyboard to type the punctuation, and that works fine.

calhix avatar Nov 01 '23 08:11 calhix

I'm not sure if this is a regression, since I don't see it mentioned: every time I add punctuation directly after a word, the word is deleted, though not all punctuation does this. Periods and commas always delete the previous word. For now I switch to the English keyboard to type the punctuation, and that works fine.

Yes, I have noticed the same. If you put a space before the comma or period it's fine, but typed directly after a word it deletes the word for some reason.

eylenburg avatar Nov 24 '23 16:11 eylenburg

Thank you for your continued support for the Korean language! Let me know if you need testing.

joopdo avatar Mar 02 '24 11:03 joopdo

What is really needed is someone willing and able to play with HangulCombiner; I suspect the issue is in there. InputLogicTest.insertLetterIntoWordHangul can be used to reproduce the issue (the test currently fails).

Helium314 avatar Mar 03 '24 14:03 Helium314

Does the test call the HangulCombiner().processEvent() function anywhere? If I wanted to add that function, should the previousEvents array consist of all the previously entered characters? (Edit: I mean also including spaces and other non-Hangul characters.)

sohamshanbhag avatar Aug 04 '24 09:08 sohamshanbhag

@Helium314 I could look into this if you add the appropriate tests. I don't have Android Studio but I can run the tests using Gradle.

sohamshanbhag avatar Aug 28 '24 10:08 sohamshanbhag

I don't have android studio but I can run the tests using gradle.

But you can download Android Studio? Or add/modify the tests yourself... I'm not sure which tests you expect me to add though.

Does the test call the HangulCombiner().processEvent() function anywhere?

The function is called on every code input, the combiner is selected depending on keyboard script. So it's called in the test.

Helium314 avatar Aug 28 '24 16:08 Helium314

But you can download Android Studio?

I tried to, but my laptop is old and so underpowered that Android Studio runs too slowly to be usable.

Or add/modify the tests yourself... I'm not sure which tests you expect me to add though.

I was looking at the same location, but insertLetterIntoWordHangul calls input (line 620), which doesn't do anything if currentScript is ScriptUtils.SCRIPT_HANGUL.

I tried to modify it as follows:

if (currentScript == ScriptUtils.SCRIPT_HANGUL) {
    val oldCodePoints = oldBefore.map { getHangulEvents(it) }.flatten()
    val oldEvents = ArrayList( oldCodePoints.map { Event.createEventForCodePointFromUnknownSource(it) } )
    val insertEvent = Event.createEventForCodePointFromUnknownSource(codePoint)
    val text = HangulCombiner().processEvent(oldEvents, insertEvent)
}

but I'm not sure if you want to handle words or sentences, i.e. with spaces in between. (I've written getHangulEvents as a function which returns the individual parts of a word, e.g. 각 -> ㄱ ㅏ ㄱ.)

If you could add the correct test there, I could try to modify processEvent so that it correctly combines the words. Presently, processEvent is not called (which can also be checked by adding a println to processEvent).
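
For reference, here is a minimal sketch of what a helper like getHangulEvents might do, assuming it relies on the standard Unicode arithmetic for precomposed Hangul syllables (U+AC00 to U+D7A3). The names decomposeSyllable and decomposeWord are hypothetical, and this is not the actual helper mentioned in the comment. Compound jamo such as ㅘ or ㄳ are returned as one jamo here, although a Dubeolsik keyboard would type them as two key presses.

```kotlin
// Compatibility jamo listed in the order Unicode uses for syllable composition.
const val INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            // 19 initial consonants
const val MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"          // 21 vowels
const val FINALS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"  // 27 finals (final index 0 means "none")

// Hypothetical helper: splits one precomposed syllable into its jamo.
// Characters outside U+AC00..U+D7A3 are returned unchanged.
fun decomposeSyllable(c: Char): List<Char> {
    val index = c.code - 0xAC00
    if (index < 0 || index >= 19 * 21 * 28) return listOf(c)
    val initial = index / (21 * 28)
    val medial = (index % (21 * 28)) / 28
    val final = index % 28
    val parts = mutableListOf(INITIALS[initial], MEDIALS[medial])
    if (final > 0) parts.add(FINALS[final - 1])
    return parts
}

// Hypothetical helper: decomposes every syllable in a word, in order.
fun decomposeWord(word: String): List<Char> = word.flatMap { decomposeSyllable(it) }

fun main() {
    println(decomposeSyllable('각')) // [ㄱ, ㅏ, ㄱ]
    println(decomposeWord("한글"))   // [ㅎ, ㅏ, ㄴ, ㄱ, ㅡ, ㄹ]
}
```

Each resulting jamo could then be wrapped in an event, as in the snippet above, but whether spaces and other non-Hangul characters should be included is exactly the open question.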

sohamshanbhag avatar Aug 28 '24 17:08 sohamshanbhag

I was looking at the same location, but insertLetterIntoWordHangul calls input (line 620), which doesn't do anything if currentScript is ScriptUtils.SCRIPT_HANGUL.

        latinIME.onEvent(Event.createEventForCodePointFromUnknownSource(codePoint))
        handleMessages()

is called independently of the script, and that's the only interaction with the keyboard (latinIME). What is skipped for Hangul is the check of whether the codePoint character was added to the text. Since the Hangul combiner doesn't simply append characters, that check doesn't make sense and would obviously fail falsely in many cases.

Helium314 avatar Aug 28 '24 18:08 Helium314

I see. The problem is that when you call mInputLogic.onCodeInput in LatinIME.java on line 1507, mKeyboardSwitcher.getCurrentKeyboardScript() is ScriptUtils.Latin. So processEvent of HangulDecoder is never called. I'm not sure how to fix this; it seems that setting currentScript in InputLogicTest doesn't change the value. Could you fix it?

Edit: Also, removing the (event.getMCodePoint() >= 0x1100) check allows punctuation to pass to HangulDecoder.processEvent where it is handled correctly.
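
To illustrate why that comparison keeps punctuation away from the Hangul path, here is a small sketch. The function name passesHangulCheck is hypothetical and only mirrors the `>= 0x1100` comparison quoted above, not the surrounding HeliBoard code. U+1100 is the start of the Hangul Jamo block, so ASCII punctuation falls below it.

```kotlin
// Hypothetical stand-in for the quoted check: true only for code points
// at or above U+1100, the start of the Hangul Jamo block.
fun passesHangulCheck(codePoint: Int): Boolean = codePoint >= 0x1100

fun main() {
    // Period and comma are ASCII, so they fail the check;
    // the jamo and the precomposed syllable pass it.
    for (c in listOf('.', ',', 'ㅂ', '각')) {
        println("U+%04X passes=%b".format(c.code, passesHangulCheck(c.code)))
    }
}
```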

sohamshanbhag avatar Aug 29 '24 12:08 sohamshanbhag

Right, the test actually fails to reproduce the correct state for Hangul. It has been broken since the script handling was switched from some scriptId to ISO_15924 strings.

Thanks for finding this! It's fixed now; I hope it helps.

Helium314 avatar Aug 30 '24 18:08 Helium314

What's the status of the punctuation issue? I still have this issue in 2.3.

iopq avatar Mar 06 '25 10:03 iopq

A small change I'd like to see in the Korean input: long-pressing a Korean letter should default to the shift character rather than the symbol character.

For instance, the key for ㅂ shows ㅃ in the top-right corner, but when you long-press ㅂ, it selects % by default and the user must move to the right to select ㅃ.

Also, punctuation eats whatever I'm typing in Korean. And if I switch from Korean to English after this happens, punctuation eats the first word that I write.

slycordinator avatar Mar 15 '25 00:03 slycordinator

@Helium314 I have written a Korean combiner that does not suffer from any of these 3 issues. It only supports the Dubeolsik layout, but I think 99% of Korean phone users use Dubeolsik so it should not be a problem.

I don't know the exact workings of the Korean code in HeliBoard, but it seems that key Events in Hangul script are fed to a HangulEventDecoder, which sends them to HangulCombiner, which combines them. The KoreanCombiner I have is different: it does not need HangulEventDecoder and takes Events directly.

Please do look into integrating this code into HeliBoard.

tenextractor avatar May 16 '25 08:05 tenextractor

The FUTO license doesn't look compatible with GPL 3.0, so HeliBoard can't use FUTO code and FUTO can't use HeliBoard code.

Helium314 avatar May 23 '25 19:05 Helium314

I have confirmed with the FUTO Keyboard developer: it's my contribution, so I can contribute the same code to you. It's not a problem.

tenextractor avatar May 26 '25 04:05 tenextractor

Thanks!

Helium314 avatar May 26 '25 17:05 Helium314

Emojis are transformed into invalid characters. I'm guessing this comes from the same behavior of HangulCombiner but I'm not familiar enough with the sources to confirm it.

I'm not sure if GitHub will keep the characters as is but here are some examples of smileys going through the Korean keyboard:

藍 🤣
 😄 

Sayrus avatar May 27 '25 12:05 Sayrus
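
One possible explanation for the emoji examples above, offered as a guess and not confirmed against the HeliBoard source: emoji are supplementary code points above U+FFFF, and converting such a code point to a single 16-bit Char drops the high bits. The sketch below shows the effect on Kotlin/JVM; truncateToChar and codePointToString are hypothetical names. For 🤣 (U+1F923) the truncated value is U+F923, which lies in the CJK Compatibility Ideographs block and would be consistent with the 藍 in the example. For 😄 (U+1F604) it is U+F604, a private-use code point that usually renders as a blank or missing glyph.

```kotlin
// Hypothetical lossy conversion: Int.toChar() keeps only the low 16 bits.
fun truncateToChar(codePoint: Int): Char = codePoint.toChar()

// Lossless conversion: Character.toChars yields a surrogate pair
// for code points above U+FFFF.
fun codePointToString(codePoint: Int): String = String(Character.toChars(codePoint))

fun main() {
    val rofl = 0x1F923  // 🤣
    val smile = 0x1F604 // 😄
    println("U+%04X".format(truncateToChar(rofl).code))  // U+F923
    println("U+%04X".format(truncateToChar(smile).code)) // U+F604
    println(codePointToString(rofl).length)              // 2 (surrogate pair)
}
```

If something like this is what happens, passing the full code point through (for example with StringBuilder.appendCodePoint) instead of a single Char would avoid the corruption.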