pharo icon indicating copy to clipboard operation
pharo copied to clipboard

Process>>terminate - fix failing test, correctly terminate non-local returns in ensure blocks and nested terminations etc.

Open isCzech opened this issue 2 years ago • 12 comments

I'd like to extend my work on #terminate (see #8993 recently merged) to fix failing testIsTerminatingForcedTermination (currently as expected failure) and further deficiencies like correctly terminate non-local returns in ensure blocks, correctly terminate nested errors in ensure blocks and nested terminations, and allow to safely debug BlockCannotReturn scenarios.

E.g. the following examples will be evaluated correctly and safely:

[[[] ensure: [^2]] ensure: [^42]] fork
"any debugging possible unless pressed Proceed"
[
	[
		[ ] ensure: [
			[self error: 'outer error'] ensure: [
				self error: 'inner error'.
				'x1' trace]. 
			'x2' trace]
	] ensure: [
		'x3' trace].
	'x4' trace
] fork.
Processor yield.
"should print x1 x2 x3"
[
	[ ] ensure: [
		[self error: 'unwind test'] ensure: [
			^'x1' trace]. 
		'x2' trace]
] ensure: [
	'x3' trace].
'x4' trace
"should print x1 x3"

The test testIsTerminatingForcedTermination can be removed from expectedFailure because it passes with the proposed PR.

The test testTerminationShouldProceedAllEnsureBlocksIfSomeWasFailed currently marked as expected failure can be removed because it refers to the UnwindError exception no longer used by the current version of #terminate.

A PR and tests will be submitted. Regards, Jaromir Matas

isCzech avatar Dec 12 '21 13:12 isCzech

Hi @isCzech

The test testTerminationShouldProceedAllEnsureBlocksIfSomeWasFailed currently marked as expected failure can be removed because it refers to the UnwindError exception no longer used by the current version of #terminate.

Actually I wanted to push the UnwindError logic back (while keeping your fixes). Now we lost any signaling about broken ensure blocks. I think it is critical issue. Try the following code:

p := [ [ 100 seconds wait ] ensure: [ 1/0 ] ] fork.
1 seconds wait.
p terminate.

It successfully completes. No way to know that we have a bug inside ensure part. Previously we would get UnwindError forked in background process. Notice that in the real live the ensure block may contain some critical "resource release" code. It should not fail silently.

dionisiydk avatar Dec 12 '21 14:12 dionisiydk

Nice example :) But when I run it a ZeroDivide exception is raised after 1 second - it appears it works as expected to me -?

isCzech avatar Dec 12 '21 15:12 isCzech

Actually I wanted to push the UnwindError logic back (while keeping your fixes). Now we lost any signaling about broken ensure blocks. I think it is critical issue. Try the following code:

p := [ [ 100 seconds wait ] ensure: [ 1/0 ] ] fork.
1 seconds wait.
p terminate.

yes, it was a major issue and is why I'm sending PR #10645 ; all nested ensure blocks are being correctly unwound now, including those containing non-local returns; those leading to the BCR error are caught and dealt with so that you can debug them freely and safely. Errors inside ensure blacks may be nested and all will be dealt with.

so no, at the moment I'm not aware of any situation an UnwindError may be necessary - I'd be very interested, though, to know if you find some border situations :) Thanks!

isCzech avatar Dec 12 '21 16:12 isCzech

Nice example :) But when I run it a ZeroDivide exception is raised after 1 second - it appears it works as expected to me -?

I use latest Pharo 10. And there is no exception after termination.

dionisiydk avatar Dec 12 '21 22:12 dionisiydk

There is no way to get an exception with current version of #doTerminationFromAnotherProcess. It evaluates unwind contexts using #runUntilErrorOrReturnFrom:. According the the name it runs the code until an error and nothing else - the error will not go to the unhandled route to get the debugger. You would need to analyze the result to detect the error and this is what the previous version of the method did where UnwindError was forked.

dionisiydk avatar Dec 12 '21 22:12 dionisiydk

yes, it was a major issue and is why I'm sending PR #10645 ; all nested ensure blocks are being correctly unwound now, including those containing non-local returns; those leading to the BCR error are caught and dealt with so that you can debug them freely and safely. Errors inside ensure blacks may be nested and all will be dealt with.

so no, at the moment I'm not aware of any situation an UnwindError may be necessary - I'd be very interested, though, to know if you find some border situations :) Thanks!

Right. With your PR my example fails. And nested failed ensures works nice.

dionisiydk avatar Dec 12 '21 22:12 dionisiydk

Hi @Ducasse , @dionisiydk , @estebanlm , @guillep Please review the latest commit 512bc5d and merge if agreed. All relevant tests are green.

After many iterations I've arrived at a code for #terminate with a very similar structure as the original elegant code in Squeak versions 1 through 3.5, i.e. before Pharo forked :) I admired the simplicity of the original code using Context>>#unwindTo: to deal with all unwinds. The enclosed version of unwindTo: fixes issues with non-local returns and extends the unwind semantics a bit (see comments in the methods).

The main highlights of the solution are:

  1. direct termination - the unwind blocks are always executed by the process being terminated (i.e. termination is process faithful)
  2. code structure follows (or copies) the structure used in the early Squeak implementation (versions 1 up to 3.5)
  3. non-local returns can be placed anywhere inside unwind blocks, even nested etc.

Compare the proposed with the original #terminate from 1999; enjoy :)

PS: for some reason when synchronizing my local repository the system pushed 118 commits to my PR #10645. Apologies for any inconvenience - I only work with Github from time to time and this was unexpected and unintended indeed.

isCzech avatar May 22 '22 18:05 isCzech

Hi @isCzech .

This code looks great. I have several questions. But I will wait once you will solve the merge problems. Good job!

dionisiydk avatar May 22 '22 21:05 dionisiydk

Hi @dionisiydk Thanks! Honestly, I don't know what went wrong with the synchronization... The only commit I intended to push was 512bc5d. The rest is "garbage" from synch going amok. Is it possible for you to just load commit 512bc5d? Otherwise please advise :) Best, Jaromir

isCzech avatar May 23 '22 05:05 isCzech

Hi @dionisiydk Thanks! Honestly, I don't know what went wrong with the synchronization... The only commit I intended to push was 512bc5d. The rest is "garbage" from synch going amok. Is it possible for you to just load commit 512bc5d? Otherwise please advise :) Best, Jaromir

See #11301 . I simply rebased your commit on top of Pharo11 branch. It keeps your authorship. But you can just repeat same for your PR. Or we can go with mine. You are boss :)

dionisiydk avatar May 26 '22 22:05 dionisiydk

See #11301 . I simply rebased your commit on top of Pharo11 branch. It keeps your authorship. But you can just repeat same for your PR. Or we can go with mine. You are boss :)

Thanks so much! I'd stay with yours if you don't mind. I have near zero experience with Github and don't want to do any more mess here :)

isCzech avatar May 26 '22 23:05 isCzech

Thanks denis! We are busy and will come back to it. This is the right moment.

Ducasse avatar May 27 '22 08:05 Ducasse