jfx
jfx copied to clipboard
8285893: Hiding dialog and showing new one causes dialog to be frozen
This PR fixes the dialog freeze problem once and for all.
This one is a bit tricky to understand, here is how it works:
This bug happens on every platform, although the implementation of nested event loops differs on every platform.
E.g. on Linux we use gtk_main
and gtk_main_quit
, on Windows and Mac we have an own implementation of a nested event loop (while loop), controlled by a boolean flag.
Funny enough, the reason why this bug happens is always the same: Timing.
- When we hide a dialog,
_leaveNestedEventLoop
is called. - This will call native code to get out of the nested event loop, e.g. on Windows we try to break out of the while loop with a boolean flag, on Linux we call
gtk_main_quit
. - Now, if we immediately open a new dialog, we enter a new nested event loop via
_enterNestedEventLoop
, as a consequence we do not break out of the while loop on Windows (the flag is set back again, the while loop is still running), and we do not return fromgtk_main
on Linux. - And this will result in the Java code never returning and calling
notifyLeftNestedEventLoop
, which we need to recover the UI.
So it is actually not trivial to fix this problem, and we can not really do anything on the Java side. We may can try to wait until one more frame has run so that things will hopefully be right, but that sounds rather hacky.
I therefore analyzed, if we even need to return from _enterNestedEventLoop
. Turns out, we don't need to.
There is a return value which we actually do not use (it does not have any meaning to us, other that it is used inside an assert statement).
~Without the need of a return value, we also do not need to care when _enterNestedEventLoop
is returning - instead we cleanup and call notifyLeftNestedEventLoop
in _leaveNestedEventLoop
, after the native code was called.~
See below for more information. To recover the UI (and in general nested nested event loops ;) we need to set a flag for the InvokeLaterDispatcher
.
Lets see if this is the right approach (for all platforms). Testing appreciated.
- [x] Tested on Windows
- [x] Tested on Linux
- [x] Tested on Mac
- [ ] Tested on iOS (although the nested event loop code is the same as for Mac) (I would appreciate if someone can do this as I have no access to an iOS device)
- [x] Adjust copyright
- [x] Write Systemtest
- [x] Document the new behaviour - in general, there should be more information what to expect
Progress
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
- [ ] Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)
Issue
- JDK-8285893: Hiding dialog and showing new one causes dialog to be frozen (Bug - P3)
Contributors
- Martin Fox
<[email protected]>
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1324/head:pull/1324
$ git checkout pull/1324
Update a local copy of the PR:
$ git checkout pull/1324
$ git pull https://git.openjdk.org/jfx.git pull/1324/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 1324
View PR using the GUI difftool:
$ git pr show -t 1324
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1324.diff
Webrev
:wave: Welcome back mhanl! A progress list of the required criteria for merging this PR into master
will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
Webrevs
- 08: Full - Incremental (ede84ef1)
- 07: Full - Incremental (ec43f2e3)
- 06: Full - Incremental (bba7fc23)
- 05: Full - Incremental (a63b5254)
- 04: Full - Incremental (0bcda3a2)
- 03: Full - Incremental (508e2569)
- 02: Full - Incremental (a4cce087)
- 01: Full - Incremental (cb8865e8)
- 00: Full (98a97b0a)
Hmm. There was a reason we added a return value to _enterNestedEventLoop
, but I'll take a closer look. Have you run all of the headful tests?
/reviewers 2
@kevinrushforth The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).
@Maran23 Please merge in the latest upstream master, since your branch is out of date.
@Maran23 I see from the description that you tested this on mac. How did you test it? Did you run with -PFULL_TEST=true
?
@Maran23 I see from the description that you tested this on mac. How did you test it? Did you run with
-PFULL_TEST=true
?
No, I checked the example in the ticket as well as the handling of dialogs in general on Mac. Unfortuantely, I always need to ask a colleague to test my JavaFX Builds on Mac, but maybe it's time to setup a VM on my Windows machine. Are there any recommendations I may should follow for the setup? :-)
Regarding the system tests, I checked the nested event loop ones on Windows and they work for me.
I just pushed a change which may fixes this for Mac as well. The change is also logical and is synchronized with the other changes I made to nested event loops. Will ask my colleague as well to test my changes again + run the most obvious (related) system tests on Mac.
I also will do a much bigger testing soon, I have an application that relies heavily on nested event loops for blocking the UI while doing background work.
maybe it's time to setup a VM on my Windows machine. Are there any recommendations I may should follow for the setup? :-)
I don't know of any supported way to run macOS in a VM on a Windows host, but perhaps others will be able to help.
Regarding the system tests, I checked the nested event loop ones on Windows and they work for me.
It worked for me on Windows, too (and on Linux in a headful test run). I would want to make sure that we know why they still work on Windows and Linux and fail on Mac, but I only saw the failures on Mac.
I just pushed a change which may fixes this for Mac as well. The change is also logical and is synchronized with the other changes I made to nested event loops. Will ask my colleague as well to test my changes again + run the most obvious (related) system tests on Mac.
I can also fire off a headful test run this afternoon, and post results tomorrow.
I also will do a much bigger testing soon, I have an application that relies heavily on nested event loops for blocking the UI while doing background work.
The results of that testing would be good information.
I see the same failure with the most recent patch as I did yesterday. I looked over the changes, and I'm not convinced that this is the right approach.
One comment on your proposal:
we also do not need to care when _enterNestedEventLoop is returning - instead we cleanup and call notifyLeftNestedEventLoop in _leaveNestedEventLoop, after the native code was called.
This changes one of the fundamental aspects of entering and exiting a nested event loop. This will need a lot more analysis to show why this is the right approach.
This changes one of the fundamental aspects of entering and exiting a nested event loop. This will need a lot more analysis to show why this is the right approach.
Yes, this is right and I also thought about documenting that as soon as we think that this is the right approach. I completely agree that we need to analyse this. This is not a simple change but will affect Dialogs and every custom code that uses nested event loops (e.g. I used it for blocking the UI without freezing it for data loading purposes, where we want to wait until the code returns in a non UI blicking way).
Maybe there is a better way as well, I just don't figured out one.
As described, right now we rely that enterNestedEventLoop
returns always, which is not happening immediately for ALL platforms, even though Platform has another implementation. Which makes me think the current approach has a flaw, resulting in this behaviour.
I see the same failure with the most recent patch as I did yesterday. I looked over the changes, and I'm not convinced that this is the right approach.
Thanks for testing. As soon as my colleague is available, I will debug this issue. I don't get why only MacOS is failing, since the low level code is pretty much the same, we just continue right after we left the nested event loop. I agree that we should find out the root cause why MacOS behaves different.
I don't know of any supported way to run macOS in a VM on a Windows host, but perhaps others will be able to help.
That was also my understanding the last time I checked. AFAIK, there is no official way from Apple, but VirtualBox may offer that. Will check as soon as I have more time.
The Mac is failing on the simplest NestedEventLoop test, the one that verifies that we can enter and exit a nested loop and correctly retrieve the return value. I started looking into this but find what's happening in Glass a bit confusing. I'll take a closer look in the next day or two.
I don't understand the original failure case. Events aren't blocked and you can type into the modal's text field, it just doesn't repaint. If you resize the window you will see your edits.
I don't think this is a problem with the nested event loop bookkeeping. It looks like a much simpler bug in the invokeLaterDispatcher.
When exitNestedEventLoop is called on the innermost loop the invokeLaterDispatcher suspends operation until the loop finishes. But if you immediately start a new event loop the previous one won't finish and the dispatcher will be wedged. If you're already in a nested loop you can get into the wedged state with just two lines of code:
exitNestedEventLoop(innermost, retVal); enterNestedEventLoop(newLoop);
When the invokeLaterDispatcher is told that the innermost loop is exiting it currently sets leavingNestedEventLoop
to true. When the dispatcher is told that a new event loop has started it is not clearing leavingNestedEventLoop
but it should. Basically it should follow the same logic used in glass; leaving the innermost loop updates a boolean indicating that the loop should exit but if a new loop is started the boolean is set back to a running state since it now applies to the new loop, not the previous one.
I don't think this is a problem with the nested event loop bookkeeping. It looks like a much simpler bug in the invokeLaterDispatcher.
Well, yes and no. My testing was showing that notifyLeftNestedEventLoop
was not even called in the finally block, so there is that. We may still have a problem at the InvokeLaterDispatcher
, although not 100% sure since it worked good for me on Windows.
But I will test it more as soon as I have more time, since I want to really check everything, so I do not miss anything here.
So I could finally look into the issue. Unfortunately, the problem is much deeper than I thought. Consider the following testcase:
Platform.runLater(() -> {
System.out.println("exit1");
Platform.exitNestedEventLoop(this, "123");
Platform.runLater(() -> {
System.out.println("exit2");
Platform.exitNestedEventLoop("this", "456");
});
System.out.println("enter2");
Object o = Platform.enterNestedEventLoop("this");
System.out.println(o);
});
System.out.println("enter1");
Object o = Platform.enterNestedEventLoop(this);
System.out.println(o);
What I would expect:
enter1
exit1
123
enter2
exit2
456
New behaviour:
enter1
exit1
enter2
exit2
456
Old behaviour:
enter1
exit1
enter2
Application hangs (Makes sense, as it is the same logic as if we hide a dialog and show a new one)
So it is better now, since we do not stall the application anymore, but we never return back from the first EventLoop
, since the native code did not return. And I would like to fix this completely.
Maybe, the approach needs to be a little bit different. Something like: If we enter a nested event loop WHILE we are exiting one, we should somewhat wait for it to finish. That is a pretty though one. Ideas welcome.
The first runLater
block is being invoked by the first nested event loop; that loop is on the call stack. In that block you call exitNestedEventLoop
but don't allow the stack to unwind before calling enterNestedEventLoop
. So on the stack the new event loop is indeed nested inside the previous one (the previous one is marked as LEAVING but it's still doing something). Since this is a stack there's only one way to unwind it which is in reverse order. The output I would expect would be:
enter1
exit1
enter2
exit2
456
123
There is that bug in the invokeLaterDispatcher which will prevent the second runLater
block from executing. And the documentation doesn't really address this case but it should.
By the way, when I grab the code in the original bug report and run it agains this PR the innermost modal is not blocked and everything seems fine. But if I press the Close button exceptions start getting thrown. This is happening on both Mac and Windows.
The output I would expect would be:
enter1 exit1 enter2 exit2 456 123
You are right, this sounds correct. EDIT: Just thinking about it again, we technically exited the first loop, so not 100% sure.
By the way, when I grab the code in the original bug report and run it agains this PR the innermost modal is not blocked and everything seems fine. But if I press the Close button exceptions start getting thrown. This is happening on both Mac and Windows.
Can you post the exception? I may know which one and I already fixed one locally, testing it currently.
I still don't have an idea how to solve the problem mentioned above, since the Application.enterNestedEventLoop()
is not returning.
Technically speaking, we have 2 issues here.
-
Application.enterNestedEventLoop()
is not returning -
InvokeLaterDispatcher
mechanism is not perfect as it can lead to stalling the UI.
The top of the call stack when the exception is thrown:
Exception in thread "JavaFX Application Thread" java.lang.IllegalStateException: Not in a nested event loop
at javafx.graphics@23-internal/com.sun.glass.ui.Application.leaveNestedEventLoop(Application.java:534)
at javafx.graphics@23-internal/com.sun.glass.ui.EventLoop.lambda$enter$0(EventLoop.java:122)
at javafx.graphics@23-internal/com.sun.glass.ui.InvokeLaterDispatcher$Future.run(InvokeLaterDispatcher.java:95)
at javafx.graphics@23-internal/com.sun.glass.ui.mac.MacApplication._enterNestedEventLoopImpl(Native Method)
at javafx.graphics@23-internal/com.sun.glass.ui.mac.MacApplication._enterNestedEventLoop(MacApplication.java:169)
at javafx.graphics@23-internal/com.sun.glass.ui.Application.enterNestedEventLoop(Application.java:511)
at javafx.graphics@23-internal/com.sun.glass.ui.EventLoop.enter(EventLoop.java:107)
at javafx.graphics@23-internal/com.sun.javafx.tk.quantum.QuantumToolkit.enterNestedEventLoop(QuantumToolkit.java:656)
at javafx.graphics@23-internal/javafx.stage.Stage.showAndWait(Stage.java:469)
at ApplicationModalSampleApp2$MainFx.showModal(ApplicationModalSampleApp2.java:42)
at ApplicationModalSampleApp2$MainFx.lambda$start$0(ApplicationModalSampleApp2.java:25)
at javafx.base@23-internal/com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
at javafx.base@23-internal/com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:232)
at javafx.base@23-internal/com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:189)
at javafx.base@23-internal/com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at javafx.base@23-internal/com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at javafx.base@23-internal/com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at javafx.base@23-internal/com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at javafx.base@23-internal/com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at javafx.base@23-internal/com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at javafx.base@23-internal/com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at javafx.base@23-internal/com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at javafx.base@23-internal/com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:49)
The top of the call stack when the exception is thrown:
Yeah, had the same issue and fixed it locally. Now pushed. Problem as stated above remains though
@Maran23 Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.
I may found a solution which fixes the dialog bug and passes my test above. It behaves exactly like @beldenfox said.
enter1
exit1
enter2
exit2
456
123
The Mac is still failing the NestedEventLoop test.
At the top of InvokeLaterDispatcher.java there's a long explanation on how it's trying to coordinate with the event loop. To do this it needs to have an accurate picture of what's happening at the platform level. In this PR you're telling the dispatcher that an event loop has exited before the platform has actually exited the loop (before enterNestedEventLoop has returned). This is causing the dispatcher to send runLater runnables to the platform earlier than expected. On Windows the runnables will get deferred to the next event loop cycle but on Mac the runnables might get executed in the current event loop cycle. I think this is one of the "native system limitations" mentioned in the comment in InvokeLaterDispatcher.java.
I've created a branch with my proposed fix for this problem (https://github.com/beldenfox/jfx/tree/eventloopjam). The fix prevents the dispatcher from jamming when the event loop it thought was leaving enters a new loop instead. Over in the Mac Glass code I also added a comment with a few more details on what's going on.
The Mac is still failing the NestedEventLoop test.
Thanks for testing this on Mac. I always need to ask someone in order to test the changes, so really appreciating that.
I've created a branch with my proposed fix for this problem (https://github.com/beldenfox/jfx/tree/eventloopjam). The fix prevents the dispatcher from jamming when the event loop it thought was leaving enters a new loop instead. Over in the Mac Glass code I also added a comment with a few more details on what's going on.
I tested your code and it looks good! So this seems to be the better approach here. While mine works for Windows and maybe Linux, it does not for Mac as you also specified.
The return value of the native nested event loops are still not needed, so we can still think about keeping this change, but we should probably revert the EventLoop changes and take your fix instead.
/contributor add @beldenfox
Thanks! I feel like we are coming closer to the fix of this problem. I hopefully have time to test this much more tomorrow - I have at least "booked" myself a time slot for JavaFX. 😄
@Maran23
Contributor Martin Fox <[email protected]>
successfully added.
❗ This change is not yet ready to be integrated. See the Progress checklist in the description for automated requirements.
@Maran23 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Please don't close this pull request. I really need this fix.
I just submitted my proposed fix for this issue as PR #1449.
@Maran23 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
@Maran23 This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open
pull request command.