jfx
8330590: TextInputControl: previous word fails with Bhojpuri characters
This change replaces Character.isLetterOrDigit(char), which fails with surrogate characters, with Character.isLetterOrDigit(int).
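As a minimal illustration of the difference (not the actual TextInputControl code), the sketch below compares the two overloads on a single Kaithi letter, U+110A6, which is the first character of the Bhojpuri sample text discussed later in this PR. The class and method names are hypothetical and exist only for this example.

```java
final class SurrogateCheck {
    /** Legacy-style check: looks at one UTF-16 code unit only. */
    static boolean isWordCharLegacy(String text, int index) {
        return Character.isLetterOrDigit(text.charAt(index));
    }

    /** Code-point check: decodes the surrogate pair starting at index. */
    static boolean isWordCodePoint(String text, int index) {
        return Character.isLetterOrDigit(text.codePointAt(index));
    }

    public static void main(String[] args) {
        // U+110A6 KAITHI LETTER BHA, written as a surrogate pair
        String bha = "\uD804\uDCA6";
        System.out.println(bha.length());             // 2 (two UTF-16 code units)
        System.out.println(isWordCharLegacy(bha, 0)); // false (a lone high surrogate is not a letter)
        System.out.println(isWordCodePoint(bha, 0));  // true (the full code point is a letter)
    }
}
```

The char overload only ever sees half of a supplementary character, so it cannot classify it correctly; the int overload works on the decoded code point.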
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8330590: TextInputControl: previous word fails with Bhojpuri characters (Bug - P4)
Reviewers
- Karthik P K (@karthikpandelu - Committer)
- Ambarish Rapte (@arapte - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1444/head:pull/1444
$ git checkout pull/1444
Update a local copy of the PR:
$ git checkout pull/1444
$ git pull https://git.openjdk.org/jfx.git pull/1444/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 1444
View PR using the GUI difftool:
$ git pr show -t 1444
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1444.diff
Webrev
:wave: Welcome back angorya! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@andy-goryachev-oracle This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8330590: TextInputControl: previous word fails with Bhojpuri characters
Reviewed-by: kpk, arapte
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.
➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.
@karthikpandelu Can you review this? We'll also need a review by a "R"eviewer.
Is this expected?
I think it might be a bug - even though it's unclear how many words the text "𑂦𑂷𑂔𑂣𑂳𑂩𑂲" contains, I would not expect it to go to the beginning of that segment.
I suspect the code in TextInputControl.endOfNextWord(boolean) is incorrect, and it needs a deeper re-write than the naive replacement with isLetterOrDigit().
I think we need to fix endOf/nextWord as well, as the logic seems to be breaking with the surrogate pairs:
The issue can also be seen with Awadhi: अवधी/औधी
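To make the surrogate-pair concern concrete, here is a rough sketch of a backward word scan that steps by code points rather than by chars. It is an assumption-laden illustration, not the logic in TextInputControl: the class name, the method name, and the choice of Character.isLetterOrDigit(int) as the only word predicate are all hypothetical.

```java
final class PreviousWordSketch {
    /**
     * Returns the index of the start of the word before the caret,
     * stepping over whole code points so surrogate pairs are never split.
     */
    static int previousWordStart(String text, int caret) {
        int pos = caret;
        // Phase 1: skip any non-word code points directly before the caret.
        while (pos > 0) {
            int cp = text.codePointBefore(pos);
            if (Character.isLetterOrDigit(cp)) {
                break;
            }
            pos -= Character.charCount(cp);
        }
        // Phase 2: skip the word itself.
        while (pos > 0) {
            int cp = text.codePointBefore(pos);
            if (!Character.isLetterOrDigit(cp)) {
                break;
            }
            pos -= Character.charCount(cp);
        }
        return pos;
    }

    public static void main(String[] args) {
        // "ab", a space, then two Kaithi letters (U+110A6, U+11094)
        String text = "ab \uD804\uDCA6\uD804\uDC94";
        System.out.println(previousWordStart(text, text.length())); // 3
        System.out.println(previousWordStart(text, 3));             // 0
    }
}
```

Note that this predicate alone is not sufficient for real Bhojpuri text: the Kaithi vowel signs in the sample string are combining marks, not letters, so isLetterOrDigit(int) returns false for them and a scan like this one would still stop in the middle of the segment. That is consistent with the suspicion above that a deeper rewrite is needed.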
Looking at the "next word" functionality across different applications on different platforms, there appears to be a wide variety of behaviors.
One vendor appears to be quite consistent: Microsoft. Its Word, WordPad, and Notepad work exactly the same, with Word behaving the same across macOS and Win11.
JavaFX TextArea is inconsistent (by design) between macOS and Win11, and is also inconsistent with Swing's JTextArea.
If I were to fix the behavior (if we decide to fix the behavior of the nextWord function, that is), I would make it consistent with MS Word, but let's discuss.
For reference, here are the results of my testing. Initially, the caret is placed at index 0, and the numbers in parentheses denote successive caret positions after ctrl-RIGHT (option-RIGHT) key presses. An underscore denotes a space, and (nl) denotes a newline.
source
_english_english_eng:_end,_eng:_(nl)
(nl)
_eng
BreakIterator.getWordInstance()
_(1)english(2)_(3)english(4)_(5)eng(6):(7)_(8)end(9),(10)_(11)eng(12):(13)_(14)(nl)
(15)(nl)
(16)_(17)eng
text area (mac)
_english(1)_english(2)_eng(3):(4)_end(5),(6)_eng(7):(8)_(nl)
(9)(nl)
(10)_eng(11)
ms word (mac) 16.84 24041420 consistent with win11
_(1)english_(2)english_(3)eng(4):_(5)end(6),_(7)eng(8):_(9)(nl)
(10)(nl)
(11)_(12)eng(13)
text edit (mac)
_english(1)_english(2)_eng(3):_end(4),_eng(5):_(nl)
(nl)
(nl)_eng(6)
chrome (mac) <div contenteditable=true>
english(1)_english(2)_eng(3):(4)_end(5),(6)_eng(7):(8)_<br>
(9)<br>
_(10)eng(11)
eclipse (mac)
_(1)english_(2)english_(3)eng(4):_(5)end(6),_(7)eng(8):_(9)(nl)
(10)(nl)
(11)_(12)eng
JTextArea (mac)
_(1)english_(2)english_(3)eng(4):_(5)end(6),_(7)eng(8):_(9)(nl)
(nl)
_(10)eng
ms word 365 ver 2302 build 16.0.16130.20942 (win 11)
same as notepad (win 11)
same as wordpad (win 11)
_(1)english_(2)english_(3)eng(4):_(5)end(6),_(7)eng(8):_(9)(nl)
(10)(nl)
(11)_(12)eng
TextArea (win11)
_(1)english_(2)english_(3)eng(4):_(5)end(6),_(7)eng(8):_(9)(nl)
(10)(nl)
_(11)eng
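For anyone who wants to reproduce the BreakIterator.getWordInstance() entry in the comparison above, the boundaries can be listed with a small sketch like the following. The class and method names are hypothetical, and the use of Locale.ROOT is an assumption, since the original test does not state which locale was used.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class WordBoundaries {
    /** Collects every boundary reported by the word BreakIterator, in order. */
    static List<Integer> boundaries(String text) {
        // Locale.ROOT is an assumption; the default locale may behave differently.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same sample text as in the comparison: spaces, punctuation, newlines
        String sample = " english english eng: end, eng: \n\n eng";
        System.out.println(boundaries(sample));
    }
}
```

The iterator reports a boundary on both sides of every word, space, and punctuation mark, which is why this entry shows more caret stops than any of the editors tested.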
@aghaisas would you please take a look at this also?
If I were to fix the behavior (if we decide to fix the behavior of the nextWord function, that is), I would make it consistent with MS Word, but let's discuss.
The behaviour in MS Word is easy to understand and is what we would expect. +1 for this.
Thanks @andy-goryachev-oracle for checking the behaviour and providing the details.
Thank you @karthikpandelu for raising the question!
I've created https://bugs.openjdk.org/browse/JDK-8331951 to deal with the "next word" function issues.
/integrate
Going to push as commit b5fe362286056e516be1d26f5d1cdda12eb20a4c.
Since your change was applied there has been 1 commit pushed to the master branch:
- 9dc4aa2341581d730c9d721e91ac0da081ffddcc: 8324327: ColorPicker shows a white rectangle on clicking on picker
Your commit was automatically rebased without conflicts.
@andy-goryachev-oracle Pushed as commit b5fe362286056e516be1d26f5d1cdda12eb20a4c.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.