Add an "exit condition" to the UI loop to allow a former UI process to finish its last cycle and terminate automatically when replaced by a new UI. This change could allow UI processes blocked on custom semaphores to continue waiting for a signal and finish their previous job before terminating automatically. Useful for keeping the UI responsive during evaluation of examples like this:
s := Semaphore new. [1/0. s signal] fork. s wait
This is a first approximation of the solution and may require some polishing :)
Failing test, not related:
- SpJobListPresenterTest>>testJobIsFinishedWhenWaitingMoreThanWorkBlockDuration
Hi Denis,
Well, increase the priority of the forked process and the PR will not help:
[1/0] forkAt: Processor activePriority + 1. [ 10000 factorial. 5 seconds asDelay wait ] repeat
Perfect example :)
I understand now that when the [1/0] process requests a new debugger the UI is waiting on a ProcessList, i.e. not on a Semaphore, and yet we'd like the UI to open a new debugger. It's clear then, as you suggested, that distinguishing where exactly the UI is waiting is not helpful.
I'd like to suggest the simplest possible solution here: Always create a new UI process for a new debug request. This way you'll make sure the current UI will be able to finish whatever it is doing (a Playground example or any deferred action) and automatically terminate when it reaches the end of its current cycle. And at the same time the new UI will be able to open the requested debugger immediately and start acting as a proper new UI.
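To make the idea concrete, the old UI loop only needs to check whether it is still the registered UI process; something roughly like this (a sketch only — the actual method and variable names in the PR may differ):

```smalltalk
"Old UI process: keep cycling only while it is still the registered UI process."
[ Processor activeProcess == UIManager default uiProcess ]
	whileTrue: [ self doOneCycle ].
"Once a debug request installs a new UI process, the condition fails and
 this process falls out of the loop and terminates after its current cycle."
```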
An alternative could be trying to distinguish whether the UI is idly waiting (on a ProcessList between cycles, or on an intercycle delay semaphore) or busy running its intracycle errands (e.g. by placing a boolean flag inside the cycle's code), but I can't see any added value here.
So unless I missed some fatal downside I prefer the former, simple solution.
It means that the same code started in the workspace may occasionally show different behavior. I would prefer it to always behave the same way. Otherwise it is a brain-breaking moment.
Can't agree more :) It seems to me, though, that "any" code fragment should now produce an identical outcome regardless of whether it's run in the UI (Playground) or otherwise (forked) ("any" meaning, of course, excluding e.g. code that refers to the UI process itself :) etc.). Or at least we're getting closer to the "ideal".
Surprisingly, even the following example works, i.e. opens a debugger instead of getting stuck, but in this case it is also thanks to some unrelated system interrupts at play, preempting the old UI and allowing the new UI to run:
[1/0] fork. [ 10 factorial ] repeat
Please check the latest commit. I'd very much appreciate your view.
Failing test in CI: CoNarrowHistoryFetcherTest>>testNarrowingReturnsSameElementsThatCallingDirectly ... no idea how it could be related. The same test is OK in the image.
@isCzech I like how simple it is now. Thanks for this work. I would just add tests for the isBlocked method.
CoNarrowHistoryFetcherTest is not related. I have another PR with the same errors.
Hi, I was checking this, long discussion. Can somebody do an executive summary? @isCzech @dionisiydk do you think this is good enough to integrate?
Hi @guillep . Summary: PR is a general solution to the following issue:
s := Semaphore new.
[1/0. s signal] fork.
s wait
Execute this code from the Playground and the image will be locked: the debugger from the background process will not be shown. Another example, where the UI is stuck in a loop rather than waiting on a semaphore:
s := Semaphore new.
[1/0. s signal] fork.
[] repeat.
The explanation is quite simple: to open the debugger, the UI process should be "free" to process the "debugger request". If the UI process is busy with something, then any new UI action (like a debugger request) will wait until the UI process is free.
The proposed solution is quite elegant:
- a debugger request should always start a new UI process.
- the old UI process is left to complete by itself (using a "stop condition" in its UI loop)
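Both points can be sketched together like this (illustrative, not the literal PR code; MorphicUIManager has long provided spawnNewProcess for installing a fresh UI process):

```smalltalk
"1. A debugger request unconditionally installs a fresh UI process."
UIManager default spawnNewProcess.
"2. The old UI process re-checks its stop condition at the top of each
    cycle ('am I still the registered UI process?'); once a new UI is
    installed, it finishes its current cycle and terminates itself."
```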
Does it make sense?
@isCzech could you resolve conflicts?
Hi @dionisiydk, I wonder why the conflict popped up; I have no experience in this regard. IMO nothing changed in the original Pharo code and the suggestion in this PR is still valid. May I ask you to double-check? I trust you have a lot of experience :)
It seems you did everything. The failing tests look unrelated. I guess any PR is failing now.
I restarted the build on CI and there are failing tests.
SystemDependenciesTest>>#testExternalMorphicCoreDependencies now looks broken.
Also, while, at first sight, it does not look related, we have around 10 other tests failing. See https://ci.inria.fr/pharo-ci-jenkins2/blue/organizations/jenkins/Test%20pending%20pull%20request%20and%20branch%20Pipeline/detail/PR-11455/8/tests/ for more details.
Does the change pass all image tests locally on a fresh P11 image?
If so, maybe these changes can be rebased on top of last P11 (for which CI is green: https://ci.inria.fr/pharo-ci-jenkins2/job/Test%20pending%20pull%20request%20and%20branch%20Pipeline/job/Pharo11/)
I updated the base to be Pharo 12
I am pretty familiar with Dolphin Smalltalk's UI code, and it indeed uses a very similar core approach of exiting the message loop when that particular process is no longer the One True UI Process, allowing the new one to take over smoothly. A neat bit of convergent evolution!
However, I have one concern: message processing in Dolphin is much more fine-grained, as it can rely on Windows to do most of the heavy lifting with respect to determining need-to-paint, so there's no internal concept of a render cycle; it just processes WM_PAINT messages as they are posted by the OS. As such, even with multiple UI processes in play, there is a reasonable expectation that messages will be handled in order, and a given need for painting will be satisfied only once. In Pharo, I would be concerned about the tail end of a previous UI process clashing with the new one: WorldState>>doOneCycleFor: does a lot of work before returning to the check in doOneCycleWhile: (and yes, there are several calls in between these two, but if there's only one World it looks like a straight shot; if there's more than one, the situation only gets worse). The equivalent in Dolphin handles one mouse/keyboard event OR paints one Win32 "window"/Morph-analogue.
This actually would already cause problems with the existing case where a new UI process is spawned, that of debugging the UI process, except that there is an awful hack of a workaround already in place, in DebugSession>>terminate:
```smalltalk
self interruptedProcess
	ifNotNil: [
		"Assume the user closed the debugger. Simply kill the interrupted process."
		self interruptedProcess terminate.
		self clear.
		Smalltalk installLowSpaceWatcher "restart low space handler" ]
	ifNil: [
		"Assume the interrupted process was resumed."
		"Kill the active process if the error was in the UI as there should be only one UI process."
		self isAboutUIProcess
			ifTrue: [ Processor terminateActive ] ]
```
It looks like interruptedProcess is nil'ed when hitting "Resume"? But going by the comments, this is the (only!) reason that debug-then-resuming the UI process doesn't permanently leave two UI processes running and fighting over the event queue, painting, etc. But, okay, I thought this would mean that you couldn't resume a DoIt, because it would kill the process right away. For some reason it doesn't, and I don't have time to chase down why. What does happen is that the debugger window remains open until the DoIt finishes, which is just weird: shouldn't it close just as, or even before, the process resumes execution? But in that case it clearly would end up killing the process. So there's definite room for improvement here once UI processes can cooperate better.
I'm not sure how to address this cleanly without a need for massive refactoring, or indeed at all—I know too little about how Pharo interacts with the host OS to be confident. I have some vague ideas about how to tease things out if anyone is interested, but it'd be easiest to chat about it on Discord or something rather than try to write it up all at once here.
(Another difference that's worth noting and may be relevant: in Dolphin the UI process blocks entirely while waiting for new events from the OS, relying on the idle process to take over and quiesce the VM. Pharo instead seems to explicitly poll for events every ~16 ms and execute a (potential) render cycle each time. Clearly there is dirty-checking, as idle CPU usage is minimal, but the image still wakes and executes at least some code each cycle.)
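For comparison, the Pharo side of that loop has roughly this shape (a simplified sketch of the Morphic cycle pacing, not the literal WorldState source):

```smalltalk
"Simplified shape of the Morphic UI loop: each iteration polls events,
 runs deferred UI actions, redraws dirty morphs, then pauses ~16 ms,
 so the image wakes at roughly 60 Hz even when nothing changed."
[ world doOneCycle.
  (Delay forMilliseconds: 16) wait ] repeat
```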