mlton
mlton copied to clipboard
Equivalent programs show different behaviour
I have an example where a finalizer is unexpectedly not being run. However, the finalizer is run if I transform the code to something (that I thought was) equivalent. This is using MLton 20210117.
A highly reduced example is attached which is built with the command make mlton
. The code contains a function declaration
fun theFunction ({n, ...} : t) = ...
and two references to the function of the form theFunction self
. When run, it produces the output
finalized a1
finalized obj1
so obj2
is not finalized. If the code is changed to declare
fun theFunction n =
and the two references to the function changed to theFunction (#n self)
, then it produces the output
finalized a1
finalized obj1
finalized obj2
so obj2
is finalized. Therefore, finalization of obj2
depends on whether the record component n
is selected by pattern matching in the function declaration or in the argument of each call.
The reason that finalization does not occur in the original example is probably due to my alternative implementation of FINALIZABLE
in the structure Finalizable
(which is based on my Poly/ML implementation). At exit, the finalizers are cleaned up by code registered with both MLtonSignal.handleGC
and OS.Process.atExit
. There is no code that empties the list of pending finalizers at exit as the in the MLton implementation. Instead, data races are avoided by using the module SharedVar
which protects access to a shared variable using MLton.Thread.atomically
. It can be seen that this protection has an effect for the list of pending finalizers in this example: in the original example, changing
fun foldmap r f x = MLton.Thread.atomically (fn () => foldmap' r f x)
to
fun foldmap r f x = foldmap' r f x
i.e. removing the protection from the module SharedVar
produces the output
finalized a1
finalized obj1
finalized a1
free(): double free detected in tcache 2
Aborted (core dumped)
Possibly I am misusing MLton.Thread.atomically
or have done something wrong. Still, I am surprised that two seemingly equivalent programs can give different results.
Using structure Finalizable = MLton.Finalizable
produces the output:
finalized obj2
finalized obj1
finalized a1
whether using theFunction self
or theFunction (#n self)
.
I'm not sure why using MLton.Thread.atomically
would eliminate the need for a loop in the atExit
handler. The purpose of MLton.Thread.atomically
is simply to prevent implicit switching to a signal handler thread. Looking at the code:
fun finalize doGC =
let
val () = doGC ()
val cleanedSome' = cleanPendingList marshalNow globalPendingList
in
cleanedSome'
end
local
val onGC = fn () => ignore (cleanPendingList marshalNow globalPendingList)
fun onExit () =
let
val () = while finalize MLton.GC.collect do ()
in
()
end
val () = MLtonSignal.handleGC onGC
val () = OS.Process.atExit onExit
in
end
there is no guarantee that all finalizers will be run. In particular, it admits the following sequence of actions:
- program concludes, call
onExit
handler. - call
finalize MLton.GC.collect
fromonExit
- call
MLton.GC.collect()
fromfinalize
- switch to GC signal handler, call
onGC
- call
cleanPendingList
fromonGC
-
cleanPendingList
returnstrue
(ran a finalizer); having removed that finalizer, another object (with a finalizer) becomes unreachable, but has not yet been collected -
onGC
returns - GC signal handler concludes, switch back to main thread
- resume after
MLton.GC.collect()
infinalize
- call
cleanPendingList
fromfinalize
-
cleanPendingList
returnsfalse
, because no GC has yet occurred to collect the object with a finalizer -
finalize
returnsfalse
-
while finalize MLton.GC.collect do ()
terminates -
onExit
returns - program terminates without running final finalizer
However, changing onExit
to:
fun onExit () =
let
(* val () = while finalize MLton.GC.collect do () *)
val localPendingList =
SharedVar.new (SharedVar.foldmap globalPendingList (fn ((), l) => (l, [])) ())
fun loop () =
let
val () = MLton.GC.collect ()
in
if cleanPendingList marshalNow localPendingList
then loop ()
else ()
end
in
loop ()
end
does not change the behavior of the program (no finalized obj2
message printed).
Using
structure Finalizable = MLton.Finalizable
produces the output:finalized obj2 finalized obj1 finalized a1
whether using
theFunction self
ortheFunction (#n self)
.
The original example does not use MLton.Finalizable
because it makes use of additional finalizable features. All use of the new features was eliminated when reducing the example. (I will raise issues to discuss these new features.)
I'm not sure why using
MLton.Thread.atomically
would eliminate the need for a loop in theatExit
handler.
There is a while-loop in the atExit
handler. Have I misunderstood your comment?
The purpose of
MLton.Thread.atomically
is simply to prevent implicit switching to a signal handler thread. Looking at the code:fun finalize doGC = let val () = doGC () val cleanedSome' = cleanPendingList marshalNow globalPendingList in cleanedSome' end local val onGC = fn () => ignore (cleanPendingList marshalNow globalPendingList) fun onExit () = let val () = while finalize MLton.GC.collect do () in () end val () = MLtonSignal.handleGC onGC val () = OS.Process.atExit onExit in end
there is no guarantee that all finalizers will be run. In particular, it admits the following sequence of actions:
* program concludes, call `onExit` handler. * call `finalize MLton.GC.collect` from `onExit` * call `MLton.GC.collect()` from `finalize` * switch to GC signal handler, call `onGC` * call `cleanPendingList` from `onGC` * `cleanPendingList` returns `true` (ran a finalizer); having removed that finalizer, another object (with a finalizer) becomes unreachable, but has not yet been collected * `onGC` returns * GC signal handler concludes, switch back to main thread * resume after `MLton.GC.collect()` in `finalize` * call `cleanPendingList` from `finalize` * `cleanPendingList` returns `false`, because no GC has yet occurred to collect the object with a finalizer * `finalize` returns `false` * `while finalize MLton.GC.collect do ()` terminates * `onExit` returns * program terminates without running final finalizer
Good point. The handler cannot rely on the result of finalize MLton.GC.collect
as you observe. By the way, this issue is much less likely in the original example where finalize inhibits one round of cleaning by the handleGC
handler to prevent double-cleaning. However, this is just for performance - to avoid excessive cleaning - and the issue may still occur.
However, changing
onExit
to:fun onExit () = let (* val () = while finalize MLton.GC.collect do () *) val localPendingList = SharedVar.new (SharedVar.foldmap globalPendingList (fn ((), l) => (l, [])) ()) fun loop () = let val () = MLton.GC.collect () in if cleanPendingList marshalNow localPendingList then loop () else () end in loop () end
I think this would lose finalizers as you observed in a message to MLton-user. It would be possible to apply a similar fix but this becomes quite a lot of code in the original example because there are 'finalizer contexts' which involves a set of pending lists, once for each context. I have tried a different approach, using a boolean notifier, to ensure that finalize
returns false only if no cleaning occurred between GC and the function returning. I think this makes the following code work:
while finalize MLton.GC.collect do ()
The updated example is attached.
does not change the behavior of the program (no
finalized obj2
message printed).
Also, the updated example does not change the behavior (no finalized obj2
message printed). It's strange.
I'm not sure why using
MLton.Thread.atomically
would eliminate the need for a loop in theatExit
handler.There is a while-loop in the
atExit
handler. Have I misunderstood your comment?
I was responding to the following from the original post:
There is no code that empties the list of pending finalizers at exit as the in the MLton implementation. Instead, data races are avoided by using the module SharedVar which protects access to a shared variable using
MLton.Thread.atomically
.
I misunderstood what was meant by "empties the list of pending finalizers". I took it to mean "the loop that keeps running finalizers", but I realize now that it means "setting r := []
".
However, changing
onExit
to:fun onExit () = let (* val () = while finalize MLton.GC.collect do () *) val localPendingList = SharedVar.new (SharedVar.foldmap globalPendingList (fn ((), l) => (l, [])) ()) fun loop () = let val () = MLton.GC.collect () in if cleanPendingList marshalNow localPendingList then loop () else () end in loop () end
I think this would lose finalizers as you observed in a message to MLton-user.
Agreed, though it shouldn't make a difference in this example, because no finalizers are created by finalizers.
Also, the updated example does not change the behavior (no
finalized obj2
message printed). It's strange.
I haven't looked at the updated example in detail, but agreed that is a strange behavior that no finalized obj2
message is printed.