jfx
jfx copied to clipboard
JDK-8269921 Text in Textflow and listeners on bounds can cause endless loop/crash and other performance issues
It's "a bit" complicated. In some situations, getRuns get's called because listeners on bounds are set. This causes TextFlow to layout to compute the runs. Afterward, the bounds of the parents get updated. This triggers a call to compute bounds - which cascades up to the children. When the geometry of the previous Text gets computed in this big stack - it throws an nullpointer. The Text doesn't have its runs, and calling TextFlow.layout is now a noop (it detects repeated calls in the same stack)
In the case it happened - it didn't repair and the application kinda crashed. This bug most likely can also be triggered by ScenicView or similar tools, which sets listeners to the bounds. It also can cause unpredictable performance issues.
Unit test and example stacktrace are in the ticket.
The suggested fix makes sure that recomputing the geometry of the Text, doesn't trigger the layout of the TextFlow. The Textflow should be layouting by the Parent. This might change the behavior in some cases, but as far as I've tested it works without issues in TextFlow Heavy applications.
Benefits:
- Better Tooling Support For ScenicView etc.
- Fixes complicated but reproducible crashes
- Might fix some rare crashes, which are hard to reproduce
- Likely improves performance - might fix some edge cases with unpredictable bad performance
Progress
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
- [ ] Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)
Issue
- JDK-8269921: Text in Textflow and listeners on bounds can cause endless loop/crash and other performance issues
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx pull/564/head:pull/564
$ git checkout pull/564
Update a local copy of the PR:
$ git checkout pull/564
$ git pull https://git.openjdk.org/jfx pull/564/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 564
View PR using the GUI difftool:
$ git pr show -t 564
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/564.diff
:wave: Welcome back fkirmaier! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
/reviewers 2
@kevinrushforth The number of required reviews for this PR is now set to 2 (with at least 1 of role reviewers).
Webrevs
- 05: Full - Incremental (dda1a0c4)
- 04: Full - Incremental (8ab38610)
- 03: Full - Incremental (404f7103)
- 02: Full - Incremental (a9516f3d)
- 01: Full - Incremental (f40b6f6c)
- 00: Full (720905a3)
How to proceed to get this PR reviewed?
I did just run into this issue again, in a totally different application. I guess we have now make a fork also for our desktop applications. (Currently we only for for JPro)
So it would be great to get this into JavaFX, because I'm quite sure it fixes an important issue.
For what it's worth, I agree that updating the bounds or synchronizing peers shouldn't trigger a new layout cycle.
Unfortunately, test coverage is not good for the javafx.scene.text classes, so we won't catch regressions that way.
Do you think that you can come up with a minimal test that shows a different outcome with or without the call to getParent().layout()? If we had that, we could decide with more certainty which behavior we think is more correct.
I got this problem as well today. NPE as runs is null in line 359. Does it make sense to first 'resolve' this by adding a simple null check and later we may try your other change which removes the call to getParent().layout()?
Edit: I also got another interesting NPE in PrismTextLayout:
java.lang.NullPointerException: Cannot read the array length because "this.runs" is null
at com.sun.javafx.text.PrismTextLayout.addTextRun(PrismTextLayout.java:770)
at com.sun.javafx.text.GlyphLayout.addTextRun(GlyphLayout.java:140)
at com.sun.javafx.text.GlyphLayout.breakRuns(GlyphLayout.java:210)
at com.sun.javafx.text.PrismTextLayout.buildRuns(PrismTextLayout.java:785)
at com.sun.javafx.text.PrismTextLayout.layout(PrismTextLayout.java:1036)
at com.sun.javafx.text.PrismTextLayout.ensureLayout(PrismTextLayout.java:223)
at com.sun.javafx.text.PrismTextLayout.getBounds(PrismTextLayout.java:246)
at javafx.scene.text.Text.getLogicalBounds(Text.java:432)
at javafx.scene.text.Text.doComputeGeomBounds(Text.java:1187)
at javafx.scene.text.Text$1.doComputeGeomBounds(Text.java:149)
at com.sun.javafx.scene.shape.TextHelper.computeGeomBoundsImpl(TextHelper.java:90)
at com.sun.javafx.scene.NodeHelper.computeGeomBounds(NodeHelper.java:116)
at javafx.scene.Node.updateGeomBounds(Node.java:3818)
at javafx.scene.Node.getGeomBounds(Node.java:3780)
at javafx.scene.Node.updateBounds(Node.java:776)
at javafx.scene.Parent.updateBounds(Parent.java:1833)
...
<<pulse>>
I've just verified that the test is also green with the call to parent.layout()
So it might be an option, to change this PR to only contain the null check.
When I find some time, I will write a test to show how the parent.layout() call can cause issues. Similar issues exist with Group - which sometimes has the effect, that applications stop working when scenic view is used. (or other tooling)
The exception in PrismTextLayout tends to happen when also the bug of this PR happens - but I don't really know how they relate.
So it might be an option, to change this PR to only contain the null check.
If that is the route you want to go, then filing a new issue / PR is better. A fix with a null check might be a good option as long as we understand why the null is happening (so we can know whether the null check is sufficient).
When I find some time, I will write a test to show how the parent.layout() call can cause issues. Similar issues exist with Group - which sometimes has the effect, that applications stop working when scenic view is used. (or other tooling)
That would be helpful. As it is, this PR is not really ready for review for the reasons I mentioned (much) earlier.
I've just removed the change in the layouting. So I guess the PR should be good now.
FYI: in the near past we have also many of those "crashes" when the UI gets heavily restructured. Crashes in our case means that on Windows for example you get the loading circle but the UI freezes endlessly. I assume it might be related.
Question: does the simple null check help in such a case?
The simple null check helps in this case. But sometimes I see the exception, posted by @Maran23 . So the root cause is probably not fixed - but the symptoms are. I guess, this PR is then a good progress.
Imo the main issue is, that having listeners on the bounds causes a change in the behavior of TextFlow and Group - causing unpredictable behavior.
The simple null check helps in this case. But sometimes I see the exception, posted by @Maran23 . So the root cause is probably not fixed - but the symptoms are. I guess, this PR is then a good progress.
Maybe it's not good progress. You've investigated the issue, and have identified a likely root cause. Not fixing the root cause just means that it most likely won't be fixed for years to come, and the insights you've accumulated will fade.
I think we should just fix the root cause. If some application breaks as a result of this, not much harm is done: we'd gain valuable insight into a real-world situation that depends on the current behavior, and can then decide whether to revert the fix, or provide a viable alternative solution (for users) that accomplishes the same goal.
@mstr2 I see you arguments and your concerns. On the other hand, as a developer affected by this problem, it is different. Costumers are complaining and now it seems to occur even more frequently.
Note: I do not have any voting rights and I do not want to make any claims. Anyhow, I think we all agree that the simple null check solves the issue by hiding it and definitely does not create any regression. Hence, affected applications may get a quick fix and applications are running stable. The actual fix can come later, taking more time, with more sophisticated testing et cetera. I am happy to help there with our application as well. Thanks!
@FlorianKirmaier This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
@FlorianKirmaier this one fell off my radar. The updated fix seems fine as long the check for null runs isn't making some other problem. Btw, the title of the bug doesn't match the problem being fixed, so I recommend changing it to something like "TextFlow: listeners on bounds can throw NPE while computing text bounds".
@FlorianKirmaier This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
@FlorianKirmaier This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.
/open
@FlorianKirmaier This pull request is now open
@kevinrushforth I guess it also fell under my radar too.
I've merged the branch with master! I also changed the title as suggested.
Testing: I've extensively tested the Change since the PR was created. All JPro versions use this bugfix, and a very big Desktop application uses these fixes. So I'm sure this doesn't cause any instability.
@prrace do you want to be the second reviewer?
@FlorianKirmaier This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
@andy-goryachev-oracle Can you be the second reviewer?
@FlorianKirmaier This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8269921: TextFlow: listeners on bounds can throw NPE while computing text bounds
Reviewed-by: kcr, angorya
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 47 new commits pushed to the master branch:
- 0d3341704c1ced4f61b5f7f7618e23ec182ae379: 8320267: WebView crashes on macOS 11 with WebKit 616.1
- d65d8a65414b96ca475d0914b6533cd6e0192984: 8316518: javafx.print.Paper getWidth / getHeight rounds values, causing errors.
- c46c172be65370ef1a121d80b71cfb34cf887d2e: 8320773: [macOS] All IME input blocked
- b80ec391cbba72d84b4b862b3f1b8db2ff8eb6e2: 8313648: JavaFX application continues to show a black screen after graphic card driver crash
- 3548cddedcf3e99af9263efc8004a092a55af8fb: 8320444: Column drag header is positioned wrong for nested columns
- 606878af275dbad551a10653c92a2deef61c10cd: 8318387: Update GStreamer to 1.22.6
- a1036b2691cd0cdaa0e99e9583558ee9dc103692: 8303826: Add FX test for JDK-8252255
- cda623d74a17c1aac034bac175e4fb10bd1fd52b: 8087700: [KeyCombination, Mac] KeyCharacterCombinations behave erratically
- 2e7330417c2f39fee15676966d5b7455bb3fcbff: 8319996: Update to GCC 13.2.0 on Linux
- 59c86bbe2b83a6e76f0dfc357f3197fe7af79e6c: 8303478: DatePicker throws uncatchable exception on tab out from garbled text
- ... and 37 more: https://git.openjdk.org/jfx/compare/5d1254fb1a126ac3d6baff5c5a303ed750d083fa...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@kevinrushforth, @andy-goryachev-oracle) but any other Committer may sponsor as well.
➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).
@FlorianKirmaier This PR is ready for you to integrate.
@FlorianKirmaier As a reminder, this PR is ready for you to /integrate
/integrate
@FlorianKirmaier Your change (at version dda1a0c43045a999abb7cdc4cb531a3db0cc856d) is now ready to be sponsored by a Committer.