NVDA incorrectly identifies word boundaries in Java applications

Steps to reproduce:

Install IntelliJ IDEA 2024.1 or later: https://www.jetbrains.com/idea/download/
Run the Jaccess inspector and select the "Track caret property events" checkbox in the "Accessibility events" menu
Open any simple Java project, for example: https://github.com/kranid/test
Open Main.java in the code editor
Ensure that the "When moving by words" option in Settings/Editor/General is set to the default value, which is "Jump to the current word boundaries."
Press Ctrl+G, type "4,13" and press Enter to set the keyboard focus to the first letter of the line: "System.out.print(i);"
Use Ctrl+Right Arrow to move by words
Observe that the Jaccess inspector identifies the words correctly but NVDA does not.

Actual behavior:

NVDA attempts to identify word boundaries using its own algorithms instead of using the Java Accessibility API to retrieve the words provided by the Java application, so NVDA and IDEA treat different text segments as separate words.

caret index	Character	Jaccess inspector	NVDA
120	.	System	System.out.print
121	o	.	System.out.print
124	.	out	System.out.print
125	p	.	System.out.print
130	(	print	(i);

Expected behavior:

NVDA correctly identifies word boundaries in Java applications

NVDA logs, crash dumps and other attachments:

jaccessinspector.log

System configuration

NVDA installed/portable/running from source:

installed

NVDA version:

2024.1

Windows version:

Edition Windows 11 Pro Version 23H2 OS build 22631.3374

Name and version of other software in use when reproducing the issue:

IntelliJ IDEA 2024.1 (Community Edition) Build #IC-241.14494.240, built on March 28, 2024 Runtime version: 17.0.10+8-b1207.12 amd64

Other questions

Does the issue still occur after restarting your computer?

Yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

The behavior does not depend on NVDA version.

If NVDA add-ons are disabled, is your problem still occurring?

Yes

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

Yes

Apr 08 '24 23:04 kranid

Potentially related #16237

Apr 15 '24 23:04 seanbudd

cc @michaelDCurran @mwhapples for thoughts

Apr 15 '24 23:04 seanbudd

Testing with a Braille display, I make the following observations. I get similar speech output to that described in this issue. On the Braille display the cursor is updated correctly, so it seems NVDA is getting the correct caret movement events. My question, though, is what precisely the output should be. Looking at the information given in the "Actual behavior" section, how it compares with the Jaccess inspector, and then comparing with similar actions in Notepad with NVDA, I am left with multiple options and no definite feeling of which is the "correct" behaviour. It appears the Jaccess inspector output in this issue relates to the text between the previous and current caret positions (i.e. the text moved over). In Notepad, NVDA speaks the word the cursor lands on (e.g. System . out . println). I don't know why, in IntelliJ, it speaks more than just that word (i.e. System.out.println).

Apr 16 '24 08:04 mwhapples

As my previous comment contained a lot of information, for clarity here is my question: Should NVDA speak the word it lands on or the text moved over by the cursor movement event? The former matches NVDA in notepad, the latter matches what has been reported for jaccess inspector.
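
To make the two options in this question concrete, here is a minimal sketch. It assumes a toy tokenizer (a run of word characters, or a single punctuation character, counts as one word); this is an illustration only and is neither NVDA's nor IntelliJ's actual word-breaking logic. The helper names `word_landed_on` and `text_moved_over` are hypothetical.

```python
import re

# Toy tokenizer (an assumption for illustration only): a "word" is a run of
# word characters or a single punctuation character.
TOKEN = re.compile(r"\w+|[^\w\s]")


def word_landed_on(text: str, caret: int) -> str:
    """Return the token that starts at or contains the caret offset."""
    for match in TOKEN.finditer(text):
        if match.start() <= caret < match.end():
            return match.group()
    return ""


def text_moved_over(text: str, previous_caret: int, caret: int) -> str:
    """Return the text between the previous and the current caret offsets."""
    start, end = sorted((previous_caret, caret))
    return text[start:end]


line = "System.out.print(i);"
# The caret moves from offset 0 to offset 6 (just after "System").
print(word_landed_on(line, 6))      # .
print(text_moved_over(line, 0, 6))  # System
```

With the caret just after "System", the first option yields "." and the second yields "System", which matches the difference described between Notepad and the Jaccess inspector.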

Apr 16 '24 08:04 mwhapples

@mwhapples, I believe IntelliJ IDEA should define what constitutes a word, at least as long as NVDA does not manage the cursor itself and delegates this function to the current application. Users can adjust cursor behavior using IntelliJ IDEA settings to modify the word definition. Therefore, I assume NVDA simply needs to announce the words provided by IntelliJ IDEA through the Java Accessibility API, so I am sure the JAccess Inspector shows the correct words.

Apr 23 '24 21:04 kranid

The thing is that the Jaccess inspector does not seem to report the word I would expect. The problem I observe is that if the cursor is placed at the start of a word, or perhaps more precisely between words, then the current character is reported as the character after the cursor (i.e. the one which would be deleted if the user pressed the Delete key), while the current word is reported as the word before the cursor (i.e. what would be removed by the Backspace key). Thus the reported current character is not part of the reported current word, so it does not seem right to rely on this part of the API in this case. I need to test whether this is a bug in IntelliJ and what it passes through the Java Access Bridge, or whether it's a deeper Java issue (maybe the Access Bridge, or maybe a Java UI toolkit thing). This bug in what the Access Bridge reports does not show itself when the cursor is mid-word. However, the bug is significant because, when navigating by word, the cursor is placed at the start of a word and so in a position where the bug is present.

Apr 24 '24 08:04 mwhapples

So for this case there does seem to be a bug in IntelliJ: it does not report the correct current word when the cursor is at the start of the word. The Java Notepad demo app in the JDK reports what I would have expected, so it does seem to be an IntelliJ-specific issue and I will report a bug there. Even when/if that is fixed, NVDA would still need to be modified to use this information from the Java Access Bridge.

Apr 24 '24 11:04 mwhapples

@mwhapples, I'd like to draw your attention to the fact that, by default, the cursor is positioned at the end rather than the start of a word. It is placed between words or at the start of the next word, but IDEA considers this position to be the end of the previous word. For instance, if we take the line " void main()" and the cursor is initially set to the first character of the line, which is a space, pressing Ctrl+Right will move the cursor between "void" and "main". The next Ctrl+Right press will position the cursor at the left parenthesis. It's worth noting that the left parenthesis is considered a word, meaning the cursor is set to the start of the word following "main". I agree that it's strange when the start of the next word is the end of the previous word, but the JAccess inspector shows the correct word within this framework. This behavior can be altered via IDEA settings, specifically under "Settings/Editor/General/When moving by words". For example, you can adjust the settings so that the cursor is always placed at the start of the word. This adjustment will yield the expected behavior, where the inspector displays the current word, including the current character.
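
The caret positions described above can be sketched roughly as follows. This is a toy model under stated assumptions, not IDEA's actual implementation: a word is taken to be a run of word characters or a single punctuation character, and only Ctrl+Right is modelled. The names `caret_stops` and `word_before_caret` are hypothetical.

```python
import re

# Assumed word definition for this sketch only.
TOKEN = re.compile(r"\w+|[^\w\s]")


def caret_stops(text: str, stop_at_end: bool) -> list[int]:
    """Offsets where Ctrl+Right would stop under the chosen policy."""
    return [m.end() if stop_at_end else m.start() for m in TOKEN.finditer(text)]


def word_before_caret(text: str, caret: int) -> str:
    """The word that ends exactly at the caret offset, if any."""
    for m in TOKEN.finditer(text):
        if m.end() == caret:
            return m.group()
    return ""


line = " void main()"
print(caret_stops(line, stop_at_end=True))   # [5, 10, 11, 12]
print(caret_stops(line, stop_at_end=False))  # [1, 6, 10, 11]

# With the end-of-word policy the caret stops at offset 10: the character
# at the caret is "(", but the word that ends there is "main".
print(line[10], word_before_caret(line, 10))  # ( main
```

In this model, offset 10 is both the end of "main" and the start of "(", which is why the reported current character and the reported current word can disagree.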

Apr 24 '24 17:04 kranid

I've also reproduced this bug, but I've noticed that when selecting text using the Control+Shift+Arrow keys, NVDA identifies the word boundaries correctly. There might be a difference between how the Control and Control+Shift arrow keys are handled. I'll check the code to see if I discover anything...

Apr 26 '24 00:04 thgcode

Here's what I discovered so far: it seems the bug is inside the text info. NVDA constructs its own representation of the text in the JABTextInfo class based on offsets, and when expanding by word it uses the offsets it calculated instead of getting them from the API (see the code of the caret post-moved script). I checked the code of the accessibility inspector to see how it displays the value for the word, and it gets it from the getAccessibleTextItems API. Strangely, I found this function defined in the JABHandler.py file, but it seems NVDA doesn't use this API at the moment...

Apr 26 '24 02:04 thgcode

I've experimented with implementing the JABTextInfo._getWordOffsets method by getting word bounds directly from the JAB API (using JABContext.getAccessibleTextItems method), and it seems to solve the issue quite well. Here's my implementation: https://github.com/nvaccess/nvda/commit/0893c6fc2cbf1130b693695c14a81b3a871b1066.

Words are now read separately as expected, and it also works with different caret behaviors, including the one in IntelliJ where the caret moves to the end of the current word instead of the start of the next word. This is not the final version, as I haven't tested all the edge cases, and also haven't considered the performance yet, but the general approach should stay the same. I would appreciate if you could test this and confirm if this approach looks right. I can then make a pull request with the fix.

May 21 '24 15:05 dmitrii-drobotov

@dmitrii-drobotov This implementation works very well when the caret is set to the first or last symbol of the word. However, if the user sets the NVDA review cursor on another symbol and presses CTRL+NVDA+. (laptop keyboard) to speak the current word, this code will call the old implementation to handle this case, and NVDA will speak the incorrect word.

Jul 12 '24 19:07 kranid

@kranid Thanks for testing, I can also reproduce this issue. I guess we have to go through all possible word locations near the offset, something like the following:

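	def _getWordOffsets(self, offset):
		# Ask the Java Access Bridge for the word the application reports at this offset.
		word = self.obj.jabContext.getAccessibleTextItems(offset).word
		wordLen = len(word)
		# The offset may be anywhere inside (or at the end of) the word, so slide
		# a window of the word's length leftwards until the text matches.
		for i in range(0, wordLen + 1):
			if self.obj.jabContext.getAccessibleTextRange(offset - i, offset + wordLen - i - 1) == word:
				return offset - i, offset + wordLen - i

		# No match found: fall back to the base class implementation.
		return super(JABTextInfo, self)._getWordOffsets(offset)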

Could you please check this implementation?
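
For anyone who wants to try the search outside NVDA, here is a self-contained sketch of the same idea. `FakeJabContext` is a hypothetical stand-in, not NVDA's real JAB context: its two methods only mimic the calls used in the snippet above, and its word policy (prefer the word ending at the offset, otherwise the word containing it) is an assumption based on the behaviour reported in this issue. The sketch also stops when the window would start before offset 0, which the snippet above does not do.

```python
import re
from types import SimpleNamespace

# Assumed word definition for this sketch only.
TOKEN = re.compile(r"\w+|[^\w\s]")


class FakeJabContext:
    """Hypothetical stand-in for the JAB context, for illustration only."""

    def __init__(self, text: str):
        self._text = text

    def getAccessibleTextItems(self, offset: int):
        # Assumed policy: the word ending at the offset, else the word containing it.
        for m in TOKEN.finditer(self._text):
            if m.end() == offset or m.start() <= offset < m.end():
                return SimpleNamespace(word=m.group())
        return SimpleNamespace(word="")

    def getAccessibleTextRange(self, start: int, end: int) -> str:
        # The end offset is treated as inclusive, as in the snippet above.
        return self._text[start:end + 1]


def get_word_offsets(context, offset: int):
    """Slide a window of the word's length leftwards until the text matches."""
    word = context.getAccessibleTextItems(offset).word
    if not word:
        return None
    word_len = len(word)
    for shift in range(word_len + 1):
        start = offset - shift
        if start < 0:
            break  # the window would begin before the start of the text
        if context.getAccessibleTextRange(start, start + word_len - 1) == word:
            return start, start + word_len
    return None


ctx = FakeJabContext("System.out.print(i);")
print(get_word_offsets(ctx, 6))   # (0, 6): caret at the end of "System"
print(get_word_offsets(ctx, 3))   # (0, 6): caret in the middle of "System"
print(get_word_offsets(ctx, 16))  # (11, 16): caret at the end of "print"
```

Under these assumptions the first matching window wins, so for adjacent identical tokens such as "))" the returned offsets may belong to the neighbouring token rather than the one reported. This could be worth covering when testing edge cases.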

Jul 15 '24 16:07 dmitrii-drobotov

@dmitrii-drobotov I did not discover any real problems with this implementation. I'm a bit uneasy about having to find the offset of a word despite already having the word and wanting to announce it. So I tried to find a way to avoid these redundant actions, but all I could come up with was something like this:

	def _getWordOffsets(self, offset):
		word = self.obj.jabContext.getAccessibleTextItems(offset).word
		# Store the word itself so it can be announced without locating it in the text.
		self.text = word
		wordLen = len(word)
		return 0, wordLen

Jul 19 '24 19:07 kranid
