languagetool LT Edit GUI response enhancement to avoid false negatives by user

Version: 7.3.4.2 (x64) / LibreOffice Community Build ID: 728fec16bd5f605073805c3c9e7c4212a0120dc5 CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win Locale: en-US (en_US); UI: en-US Calc: CL

Currently testing LT 5.9 snap 20220714 for other things and a longstanding GUI response annoyance came to mind.

A) A contributing but likely not primary factor is the current 'shades of gray' S&G Edit pane palette. A mouse_click on a button results in a GUI response that is just not that visually distinctive. Combined with the following points, this can lead to user doubt regarding whether a successful mouse_click was accomplished on a particular button.

B) SETUP: LT presents a complaint in the S&G Edit pane. The user selects/mouse_clicks on one of the Ignore or Change buttons. There follows a user-perceptible delay in seeing a GUI change of any scale. The selected button may change appearance immediately but given (A) the user may have doubts the mouse_click was successful. The user-perceptible delay grows longer, longer. The user now has enough time to become frustrated and severely doubt the earlier mouse_click was successful, so mash that Ignore or Change button again WAKE-UP WAKE-UP LT!

C) PROBLEM: The system and LT finally catch up, and, given the queue of mouse_click events, LT rapidly processes through the next LT Edit complaints in a way the user may not have intended, AND, #$%^!!!, LT does not have an UNDO history function to easily recover. The user has to try to remember and then find the complaint location before the runaway.

D) A contributing factor is that if the LT pane loses focus for whatever reason, then the first subsequent mouse_click is a regain_focus event, not a button_select event, even though the user did mouse_click visually over a GUI button. The user is not tracking that the LT GUI pane lost focus and so must account for the regain_focus consumption of a mouse_click. But again, given (A), the user may not recognize the lost-focus distinction, the GUI just seems to delay, delay, so the user mashes the buttons WAKE-UP WAKE-UP LT!

HYPOTHESIS: Especially when background processing is enabled, LT, well, a low-level library, issues a re-paint of the selected Ignore/Change button, returns to LT main, then LT main lets loose the hounds of multi-core background processing to find the next complaint quick, quick! The computer fans spin up! In the meantime, the LT GUI is starved of re-paint compute cycles out to user-perceptible delays and what did change -- that Ignore/Change button -- just... did it really change???????

SUGGESTED SOLUTION: A user preference option that the three main text boxes of the Edit pane should get a gray-out treatment or some other sign in the Edit text boxes that very distinctively tells the user that LT is done with that complaint per user instruction, AND LT must assure that GUI re-paint is complete BEFORE substantially loading the system in a way that will starve the GUI re-paint of CPU cycles, AND after painting the next complaint to the GUI will CLEAR the input queue to assure subsequent mouse events are the user's instruction regards the current complaint and not an over-run of queued events from some previous complaint. The idea of this as a user preference is to preserve the historical behavior for those who want it, at least in the near term.
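The "lock out input until the next complaint is painted" part of this suggestion can be sketched roughly as below. This is only an illustration under stated assumptions, not LanguageTool's actual dialog code: the class `ClickGate`, its methods, and the button names passed as strings are all hypothetical. The idea is that a click is honored only while the pane is READY, and anything arriving while it is BUSY (grayed out) is dropped instead of being applied to a later complaint.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a click gate; not taken from the LanguageTool sources. */
class ClickGate {
    enum State { READY, BUSY }

    private State state = State.READY;
    private final List<String> accepted = new ArrayList<>();

    /** Handles one button click; returns true only if the click was acted on. */
    boolean onClick(String button) {
        if (state == State.BUSY) {
            // Stale click aimed at a complaint that is already being processed: drop it.
            return false;
        }
        accepted.add(button);
        // In a real dialog this is where the buttons and text boxes would be
        // grayed out, before any heavy background work is started.
        state = State.BUSY;
        return true;
    }

    /** Call once the next complaint has been fully painted. */
    void onNextComplaintShown() {
        state = State.READY;
    }

    /** Clicks that were actually applied, in order. */
    List<String> acceptedClicks() {
        return accepted;
    }
}
```

With a gate like this, a frustrated second and third click on Ignore during the delay would be discarded rather than queued against the following complaints.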

Regards.

Jul 16 '22 17:07 BloomingAzaleas

All Buttons are disabled and all text areas are set to gray, while the dialog is working in the background. Please test it tomorrow (snapshot 20220720).

Jul 19 '22 12:07 FredKruse

There were some changes. Please test snapshot 20220721 tomorrow.

Jul 20 '22 12:07 FredKruse

Testing 5.9 20220720 snap.

This may take a while. The gray-out works well in signalling LT state. However, I have encountered funky behavior in circumstances I am still attempting to narrow down. The LO Writer docs I am using for testing are complex as an LT stressor but also not ones I can share, so I have to narrow it down then attempt to reproduce in Lorem Ipsum example docs.

Subject to further investigation the funky behavior is this: LT gets in to a state where I can advance LT through two complaints simply by mouse_click to desktop (LT loses focus), then mouse_click to LT title bar (LT re-gains focus only, no mouse action over any S&G pane buttons!). LT then acts as if it received an Ignore instruction and advances. LT then acts as expected on the 3rd complaint, that is, does NOT advance from a title-bar mouse_click. Gosh it took a while to figure that out.

Possibly related conditions: The exact sequence of LO operations prior to engaging LT Check Text. In particular: A) If the LO doc has 1 or > 1 embedded external data links for importing text from other doc file. Updating from these links at doc open time can be set by the user for AUTO or ASK_USER. They also can be updated at a later time via Tools->Update->Links. B) If set to ASK_USER (which happens to be my practice), then I get the different LT behaviors if I YES_UPDATE or NO_DO_NOT_UPDATE for the, rabbit-hole-time, 1 external link case versus the > 1 external link case. Have not tested the zero links case yet.

Further symptoms: With one sequence of LO operations prior to engaging LT, LT seems to miss an initial complaint that it catches with a different initial sequence of LO operations. Further, in normal operation LT re-positions the LO document portion shown in the LO window to, in theory, show the text surrounding the content of the upper S&G Edit window. In this funky mode, LT is pages off in positioning the LO doc portion.

All of the above suggests two independent confusions are in play: A) LT is receiving the FOCUS_GAINED event but is interpreting it, perhaps via a default case, as a (safe) Ignore_button event. Given the new gray-wash feature, I presume the correct response sequence should be: Receive FOCUS_GAINED event, re-paint GUI to show FOCUS_GAINED state, CLEAR event input queue to assure no queued-event overrun from some prior state, await user instruction via buttons. If FOCUS_LOST event received, re-paint GUI to show FOCUS_LOST state, await FOCUS_GAINED event. However, there is some additional counting or tick-tock effect in play that counts 2 of these cycles, after which LT starts the expected behavior where FOCUS_GAINED is only that. See also https://docs.oracle.com/javase/7/docs/api/java/awt/event/FocusEvent.html
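A rough sketch of the response sequence proposed in (A), for discussion only. It uses a plain enum instead of the real java.awt.event.FocusEvent so it runs without a display. The class `FocusAwareDialog`, the `Event` names and the `complaintIndex` counter are hypothetical and do not reflect how LT actually dispatches events. The point is that each event has an explicit case, and the default case does nothing rather than falling through to Ignore.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of the dialog's event handling; not LanguageTool code. */
class FocusAwareDialog {
    enum Event { FOCUS_GAINED, FOCUS_LOST, IGNORE_CLICK, CHANGE_CLICK }

    private boolean focused = true;
    private int complaintIndex = 0;
    private final Deque<Event> pending = new ArrayDeque<>();

    void enqueue(Event e) {
        pending.add(e);
    }

    /** Processes queued events in arrival order. */
    void drain() {
        while (!pending.isEmpty()) {
            handle(pending.poll());
        }
    }

    private void handle(Event e) {
        switch (e) {
            case FOCUS_LOST:
                focused = false;      // a real dialog would re-paint the lost-focus state here
                break;
            case FOCUS_GAINED:
                focused = true;       // re-paint the focused state
                pending.clear();      // discard events queued against the prior state
                break;
            case IGNORE_CLICK:
            case CHANGE_CLICK:
                if (focused) {
                    complaintIndex++; // only a button press advances to the next complaint
                }
                break;
            default:
                break;                // unknown events are never treated as a button press
        }
    }

    int complaintIndex() {
        return complaintIndex;
    }
}
```

In this model a desktop click followed by a title-bar click produces FOCUS_LOST then FOCUS_GAINED and leaves the complaint position unchanged.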

B) LT or LO or both are getting confused by the external link update as to where the LO cursor is and/or the post-external-update contents of the flat paragraph list.

So, I have additional testing to narrow the cases to something reproducible and sharable.

Regards.

Jul 21 '22 19:07 BloomingAzaleas

Have had various personal business and boot stability issues delaying testing. Now testing LT 5.9 snap 20220728 with both my primary, book-style but non-sharable ODT test files and a sharable test ODT consisting of a public domain book. Had to install Calibre and do an EPUB->DOCX->ODT conversion chain to get there. The sharable file currently is not nearly as complicated in LO Writer feature inventory as my non-sharable, so as I find issues in the non-sharable I poke and add suspect LO feature complexity to the sharable in an attempt to reproduce.

Behavior has changed from that reported immediately above. I need to work on reproducing it in the sharable. However, a raw report regards the primary non-sharable is:

A) Background processing is ENABLED.

B) Upon ODT open, LO by default places the cursor at the beginning of the 1st para in the document. There follow several pages of what in book publishing would be called "front matter" consisting of:

text boxes for title, author and other meta annotation,
2 LO automatic-object TOCs,
LO frame placeholders for possible graphics
a section of text included RO from another ODT which LT does read and show green-line complaints
more LO automatic-object TOCs

LT does not support the LO frames/text boxes and graphics objects. Understood. However, as we proceed below, these un-supported objects are intermixed with normal-text paras (not hosted in Objects) acting as headers (have LO "Heading" para styles applied) for them and in theory, those normal-text paras should be LT supported and visible in LO. Having said that, LT in the past has behaved as if it either has a heuristic for ignoring headers (as they almost always are incomplete sentences) or maybe LO does not include "Heading" styles paras in that "flat list"?

Finally, the start of normal book-content text paras with header-style paras mixed in.

If, with the cursor at the default position of 1st document para, I LT "Next spelling..." or "Check Text" buttons, I get

2022-07-28-LT-5 9-20220728-snapshot-OOB

LO at this point is: Version: 7.3.4.2 (x64) / LibreOffice Community Build ID: 728fec16bd5f605073805c3c9e7c4212a0120dc5 CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win Locale: en-US (en_US); UI: en-US Calc: CL

Maybe LO is not going deep enough in the document to find a flat-list qualifying para? Said conversely, maybe my non-sharable has so many LT unsupported objects at the head of my document that the default LO lookahead depth never gets to a normal-text para?

OK, restart LO, re-launch the same test ODT, then immediately manually move the cursor deeper in to the document to a multi-sentence normal-text para. Skip over the complicated stuff up front. In the meantime, LT is background processing and green-line LT complaints show in the text as I scroll in to the document.

ANYWHERE I place the cursor in multi-sentence, normal-text paras, including selecting a text block, including just prior to an LT green-line complaint, I get:

For button "Next spelling or grammar..." the popup "Grammar check does not support..."
For button "Check Text" the S&G Edit pane appears and immediately reports "LanguageTool check is complete." This doc is 150+ pages, 64k words, so no, LT will not finish a Check Text immediately from near the start of the doc. Not to mention there are green-line complaints in plain sight ahead of the current cursor position.

A right-mouse over a green-line does bring up the LT context menu with appropriate further LT actions.

Button "Refresh Check Results" has no effect. In the past it would sometimes reset LT to a good state.

So, the "Next" button function is confused about the type of the next text object. The S&G Edit pane behaves as if there is no further text to scan through for user fixes.

Having reported the above, I am thinking to await guidance on which of several possible things going on here to reproduce. In the meantime, I will look to generically complicating my sharable doc. Possibly I can replace non-sharable with sharable text in to the structure of my non-sharable doc then sufficiently sanitize the result. Will take some work.

Regards.

Jul 28 '22 21:07 BloomingAzaleas

The exception is solved. But I can't reproduce the reason for the missing function of the check dialog. Could you please send me the sharable text if ready? The LanguageTool.log file of the non-sharable text after the problem appears may also be helpful. Regards.

Jul 29 '22 09:07 FredKruse

Working on the sharable text version. I've decided the best way to go for test replication is take my non-sharable doc and replace the text, sanitize as much as I can find in all the metadata. That way, other than length, the exact structure of LO features and sequence is replicated. Will take a while as I also have to replace the contents of frames/text boxes in the way they are constructed now with LO property variables, the contents of graphics objects and their captions to provide some line items to maintain the existing Table of Illustrations auto-objects, and replace/create header paragraphs to maintain the presence of that feature.

Regards.

Jul 29 '22 13:07 BloomingAzaleas

Attached a sharable text file set in a zip of files including a README. I can reproduce the above reported behavior of LT 5.9 snap 20220728 on the main document AAiW-LT-6928-test-v1.odt. The main doc has two external import links to test that mechanism as well as I have seen variation in bug behavior around that mechanism. The README contains instructions and screenshots on how to set up the files and configure LO to gain doc open-time control of that import mechanism.

LT-6928-test-ODT-set-v1.zip

Regards.

Aug 01 '22 03:08 BloomingAzaleas

@BloomingAzaleas Thank you for the document. I can reproduce the bug, but it is difficult to solve. It will take some days.

Aug 03 '22 08:08 FredKruse

@FredKruse The choice of document content was no accident. Fred's Adventures in Wonderland.

Updated main doc with additional formatting elements to increase doc complexity + a few random footnotes to assure an index table at the end has entries.

AAiW-LT-6928-test-v2.odt

Aug 03 '22 12:08 BloomingAzaleas

The so-called text frames weren't supported by LT until now. This was the reason for failing to check your document. This missing feature has now been added. I also added a check for protected content, like the embedded texts in your document. Protected areas are no longer checked by LT. Please test the next nightly 20220805.

Aug 05 '22 15:08 FredKruse

Will test.

LO Writer offers 2 "text enclosure" objects:

Text boxes -- an older concept of relatively limited functionality, conceptually inherited from MSO. Text boxes offer more control of text placement in a doc or on a page. The interior text is not subject to the surrounding text re-flow. A way of assuring the interior text stays together regardless of surrounding re-flow, i.e., assure all stanzas of a poem appear on the same page. OTOH, because of limited features, simpler to use compared to FRAMES.

Frames -- effectively a Writer sub-document. Plus other goodies including linking multiple frames such that they form an independent text re-flow domain as if a single frame.

https://ask.libreoffice.org/t/frames-vs-text-boxes/20847/3

It is OK for LT to not support various object types as long as that is documented and which object by office suite (Google, LO, MSO, OO, etc.), which gets back to the lack of doc on the LT Website. However, unsupported should mean ignored for LT complaint scanning. When a Next Complaint or Check Text is launched from a supported text location, encountering an unsupported object should mean somehow skipping past it. When Next Complaint or Check Text is launched within an unsupported object, then the Unsupported Object error popup is appropriate.

Note that the test doc started with an empty paragraph, thus a doc open or Recheck Document will always start there. LO Writer (and MSO Word) by default use paragraphs to anchor non-paragraph objects, though other anchor options are offered (to page, to character, as character). This means sometimes empty paragraphs are present simply to act as anchors for objects because the paragraph marks are subject to re-flow of the preceding text. Manual (forced) page breaks also anchor to the immediately preceding paragraph mark. Another peculiarity of LO Writer is that if one inserts two index objects sequentially WITHOUT pre-positioning at least one paragraph, even an empty one, between the index objects, it is then later impossible to insert a paragraph between them -- you have to delete the trailing index object, insert a paragraph mark, then re-create the index object. Thus in the test doc you will find index objects followed by at least one, typically empty, paragraph mark as a way to future proof.

Regards.

Aug 05 '22 19:08 BloomingAzaleas

Tested LT 5.9 20220805 snap for all the prior reported behaviors, including false complaint advances. Tested external doc link update or no update cases. Had one anomaly where LT appeared to step in to a Frame (frame32 in the LO Navigator, pg 19 "Fury said to"), scan it, and bring up an S&G complaint, but then I could not reproduce it from a clean LO launch. Got an S&G scan all the way to the end. It did S&G scan the Footnotes (as the last paras in the doc) and moved the cursor appropriately, so good I put some in.

ALL GOOD for testing to date for this enhancement functionality and other bugs that were discovered and cleared along the way. Now that it all seems to work, a user-confusion GUI consistency item became more intrusive.

Place the cursor in a Frame, then the LT Next button and get the "Grammar check doesn't support..." error msg box as expected. Not a supported object. Repeat but use Check Text instead and get the S&G pane up immediately showing "LanguageTool check is complete." and the Result bar at 100%.

Yes, from a programming POV, the cursor inside an unsupported object is functionally equivalent to a zero text paragraph or End of Doc condition. I suggest from the user's POV it's all the same -- visually the cursor is in an unsupported object, thus the user should get the same GUI response from any LT action button that behaves in other contexts as if starting a text walk from the current cursor position. Both the Next and Check Text buttons should throw the unsupported object error msg box. Not being consistent in that response confuses and mis-trains the user on the concept of unsupported objects, then specifically as to what is or is not an unsupported object in the LO Writer GUI. If, visually, there remains text in the doc beyond the cursor position, beyond the unsupported object, that LT would walk normally if only the cursor was moved out of the unsupported object, then the "LanguageTool check is complete." response rings visually false, suggesting breakage in LT.

Alternatively, if LT DID support text boxes/frames/linked_frames as conceptual sub-documents eligible for S&G, then a "check complete" within a sub-doc would make sense. The user would just have to be trained to that sub-doc concept.

Regards.

Aug 05 '22 21:08 BloomingAzaleas

ARGHHHH!

In testing with LanguageTool-5.9-20220805-snapshot and LanguageTool-5.9-20220806-snapshot with other docs, I have managed to reproduce the Next and Check Text behavior similar to that reported in https://github.com/languagetool-org/languagetool/issues/6928#issuecomment-1198660415 of:

ANYWHERE I place the cursor in multi-sentence, normal-text paras, including selecting a text block, including just prior to an LT green-line complaint, I get: For button "Next spelling or grammar..." the popup "Grammar check does not support..." For button "Check Text" the S&G Edit pane appears and immediately reports "LanguageTool check is complete." This doc is 150+ pages, 64k words, so no LT will not finish a Check Text immediately from near the start of the doc. Not to mention there are green-line complaints in plain sight ahead of the current cursor position.

A right-mouse over a green-line does bring up the LT context menu with appropriate further LT actions.

PLUS if background processing is disabled in the Options pane, then:

Check Next results in an S&G pane that stalls, or no pane at all
Next can cause a hang of the doc (Windows whiteout).

These two snaps DO work as expected with the Wonderland test doc I submitted, and the Wonderland doc was created by copy-paste textual replacement in the same doc template from which the docs that are failing now were created. All the LT unsupported object types are the same and in approximately the same order and same count, though the Wonderland docs are half the length. At present I have no clue as to the functional diff for what LT sees.

Sooooooo... I will have to do something I have tried to avoid and do a recursive bisect of the failing docs. There are multiple ways to do this (bisect at the middle of the doc, or divide 25%, 50%, 25% then extract/delete the middle 50%, etc.) which could yield different results.
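For readers unfamiliar with the technique, the simple "cut at the middle" variant of a bisect can be sketched as follows. This is a generic illustration, not a tool used in this issue: `DocBisect`, the list of named document blocks, and the `fails` predicate (which in practice would be a manual step of opening the cut-down ODT and running Recheck Document) are all hypothetical. The sketch assumes the failure can be found in one half or the other; when neither half fails alone, it stops and returns the current range, which is the case where the bug depends on an interaction between blocks.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of a halving bisect over document blocks. */
class DocBisect {
    /**
     * Repeatedly halves the block list, keeping whichever half still fails.
     * Returns the smallest failing range found by this strategy.
     */
    static <T> List<T> bisect(List<T> blocks, Predicate<List<T>> fails) {
        List<T> current = blocks;
        while (current.size() > 1) {
            int mid = current.size() / 2;
            List<T> first = current.subList(0, mid);
            List<T> second = current.subList(mid, current.size());
            if (fails.test(first)) {
                current = first;
            } else if (fails.test(second)) {
                current = second;
            } else {
                // Neither half fails alone: the failure needs blocks from both halves.
                break;
            }
        }
        return current;
    }
}
```

A different cut strategy, such as removing the middle 50%, would visit different subsets and so could indeed converge on a different minimal case.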

Regards.

Aug 07 '22 19:08 BloomingAzaleas

There were still some problems with view cursor handling. Possibly that causes the failures. Please test it with tomorrow's snapshot 20220809.

Aug 08 '22 09:08 FredKruse

Yes. I managed to bisect down to the overt issue in one of my non-sharable docs. A hypothesis came to me after I shut down for the night. I need to work on re-creating the issue in a sharable doc. If my hypothesis is correct, it is a subtle, indirect bug having nothing to do with this enhancement per se. The proximate buggy behavior is a symptom of something more fundamental for LT, which in turn explains why it appears/disappears so erratically. But yes, the cursor plays a prominent role. Likely I will close this enhancement as achieved and open a new issue.

Regards.

Aug 08 '22 12:08 BloomingAzaleas

Progress report here since I do not have completely conclusive before-after demo examples for opening a new issue. Still testing with LanguageTool-5.9-20220806-snapshot so I can establish a firm understanding of behaviors and hypothesis of fails before updating to more recent snap levels.

Bisecting my non-sharable doc, I can get to a minimal case, but when I apply my hypothesis to the Wonderland document I get similar but not identical LT fail conditions. The process is very tedious as each bisect cut-away step has to be labeled and preserved as a file so a before-after demo set is created. Every test has to be a complete shutdown of LO then scratch re-launch of LO and the test doc from a double-click to assure LO and LT are starting from the same state. Having said that, LO does have a Tools->Option->General Quickstarter option for maintaining a background LO service daemon, and in any case, once launched, I think LO retains that daemon presence. So after first launch, LO is never entirely a scratch application launch.

Then, given some of what I have observed (below), there are 2 routes: A) Repeated small cuts and variation experiments in complex docs to isolate, and B) docs built from scratch using as many Writer defaults as possible to reproduce hypotheses from A. I have not yet achieved a convergence of A and B, suggesting further dependencies to be identified.

Items:

I test by always starting with Recheck Document as a way to assure a consistent LT startup. I define an LT FAIL as Recheck Document S&G immediately reporting Check Complete and Result 100% even though the test doc deliberately has several different types of complaint items LT should find. I define LT SUCCESS as Recheck Document S&G text walking in to the document and complaining as expected. I &Ignore my way to the end of the document to observe other behaviors, but see below a different behavior I stumbled on.
I test with background-processing disabled on the theory this lets me see S&G complaints at the time LT detects and throws them, whereas background-mode might asynchronously create fails far away from the view cursor, so then I have no idea.

I have noticed but have not probed in detail the following along the way:

With background-processing disabled, the LT S&G scan can get in to a state where it appears to be operating from a cache that has not been updated from &Change actions applied through S&G, much less manual changes by the user via the Writer GUI window. That is, a subsequent Recheck Document in the same document session reports the prior text and complaint in S&G even though the complaint was corrected away in the prior Recheck pass. As if LT foreground-mode is not subscribed to or if subscribed, not processing LO text update events to refresh its view of the doc text. However, I mentioned in earlier days of this issue that the Writer cursor and LT's idea of where that cursor is lose sync, and I have seen application of &Change land in the wrong place, which I as a user might not notice because I am looking at the S&G pane, not the Writer GUI window, leaving the original complaint text for re-detection by a re-scan.
The hypothesis I am chasing has a firm dependency on Update Links (re-importing) external document sections in a master doc at master doc open time. With LO Options set to ask that update question at open time, for the exact same test doc file, YES promotes LT FAIL, NO promotes LT SUCCESS. I do not know the details of Extension initialization in relation to an Update Links operation at doc open time, or later via Tools->Update->Links. Two broad cases are: A) Update YES changes some subtlety of how Writer presents doc text to an Extension (possibly as a bug in Writer?), and B) Extensions that can and do init prior to an Update Links event retain state data made stale by the doc-open Update Links event.
Despite, in theory, LT not supporting LO text boxes and frames, the Recheck Document S&G scan does step in to them to some extent and will detect and throw complaints about text_box/frame interior text. Conversely, manually placing the view cursor at the start of text in said objects then Check Text gives an immediate LT FAIL.

Regards.

Aug 10 '22 16:08 BloomingAzaleas

I made some modifications to the dialog. LT no longer spell checks protected words. If you click on 'next error', frames are not grammar checked (jumping to an error that isn't marked makes no sense; grammar check in frames is not supported by LO).

Aug 17 '22 09:08 FredKruse

Rudely, my pocket universe of LT bug chasing has been trampled aside by life-issues such as a waterfall roof leak and re-stocking the pantry. I have been forced to develop a spreadsheet and other infrastructure practice to track test cases. Hope to resume pace later this week.

Regards text frames in LO (in the Wonderland doc), more specifically I observed A) an LT spell check complaint, and B) in a different frame, an LT smart quotes complaint on the first word of a frame, though no complaints about obvious issues further in to the frame text, as if LT was only looking at the first few words. Is the smart quotes complaint considered spelling or a style/grammar complaint? I know there is a smart quotes style rule somewhere.

FYI, if you are not aware, however the flat paragraph list is constructed with LO Writer, both text frames and then footnotes+endnotes (call them append-notes as a class) get processed by LT AFTER all other text blocks in the document, NOT in document visibility order. For example, by constructing footnotes containing deliberate LT complaints, a Re-check Document (and the view cursor) walk the doc text from page 1 as expected but skipping past frames and footnotes to the end page, then the S&G view jumps back in to the doc and walks the frames then footnotes.

Regards "protected words," now I am unsure as to what this phrase means in LT-in-LO-land. Most literally in LO, these would be Writer section objects (external doc links for import objects) marked in their definition as immutable, vice defined as being mutable with changes written back to the parent document at SAVE time. Next more inclusively would be the Writer automatic objects of inserted fields and indexes that are maintained by Writer. Index objects cannot be directly edited from the Writer GUI. Previously, LT could/did modify them but as pointed out in #6927, this would be deceptive since those changes would be over-written when next Writer updated those objects.

Maybe by "protected words" you mean text in unsupported objects that LT was peeking at anyway? Writer frames do offer a content protection (enforced read-only) property that is disabled by default, whereas section objects are write-protected by default.

Regards.

Aug 17 '22 16:08 BloomingAzaleas

Found it. A combo of static structural sequence in an LO Writer doc AND a specific dynamic sequence of user actions with the doc that nominally have nothing to do with LT. The result is LT Recheck Doc reports scan complete NO COMPLAINTS as if nothing found when there clearly are things to complain about. With a different user dynamic sequence, LT behaves as expected.

I need to construct a tight set of demo case files, test some additional edge cases that come to mind, then also test against the most recent LT snap and LO release. Will close this issue when I open the new one.

Regards.

Sep 03 '22 04:09 BloomingAzaleas

This LT GUI enhancement request completed.

Sep 12 '22 12:09 BloomingAzaleas
