mmtk-core
mmtk-core copied to clipboard
Unclear contract of `flush_mutator`
Currently, the doc comments of flush_mutator
and destroy_mutator
say:
-
flush_mutator()
: Flush the mutator's local states -
destroy_mutator()
: Report to MMTk that a mutator is no longer needed. A binding should not attempt to use the mutator after this call...
Understandably, after destroy_mutator()
is called, the VM binding must not call any other method of Mutator
, and may reclaim its memory. This is relatively clear. The VM binding should call this when a thread exits.
But the contract of flush_mutator()
is unclear. It is unclear what "flush" and "local states" mean in the context of "flush the mutator's local states".
Currently,
-
mutator.flush()
callsmutator.flush_remembered_sets()
, and -
mutator.on_destroy()
callsallocator.on_mutator_destroy()
for all allocators. Currently only mark-sweep spaces need this. They return blocks to the global pool.
Obviously the allocators are part of a mutator's "local states", too, but they are not flushed during flush_mutator
. This is confusing.
What does MMTK core do?
mutator.flush()
is only called in the ScanMutatorRoots
work packet. And the only BarrierSemantics
that implements BarrierSemantics::flush
is GenObjectBarrierSemantics
. It flushes the modbuf by creating ProcessModBuf
work packets in the Closure
bucket.
If a mutator lives until the end of the program, this call site should suffice. The problem is, if a mutator thread exits between two GCs, it will record something in its modbuf, and reserve some blocks, but the Mutator
instance will not be enumerated by ActivePlan::mutators()
in the next GC. Those "dead" mutators must be reported to MMTk core before the next GC. This is the reason why we need to expose some form of mutator-flushing operation to the VM binidng.
What does the OpenJDK binding do?
The OpenJDK binding
- calls
flush_mutator()
inMMTkBarrierSet::on_thread_attach()
- calls
flush_mutator()
inMMTkBarrierSet::on_thread_detach()
- calls both
flush_mutator()
anddestroy_mutator()
inMMTkBarrierSet::on_thread_destroy()
This is confusing, too, for several reasons.
- After a thread is detached, the thread should not do anything to OpenJDK's heap. So even if the
Mutator
is reused after a subsequent "attach thread" operation (which is not. See below.), theflush_mutator
inon_thread_attach
should be redundant. - And
jni_DetachCurrentThread()
always callsThread::exit()
(which callson_thread_detach()
),and then destructs theThread
instance (which callson_thread_destroy()
). This means there is no such thing as merely "detaching" a mutator. TheThread
is always destroyed when detaching a thread in OpenJDK. This means when attaching thread the next time, it is attaching the current POSIX thread to a differentThread
instance which is created inattach_current_thread()
.- Similarly, when attaching, a new
Thread
instance is created, and thenThread::initialize_thread_current
is called (whch callsbind_mutator
), and thenThreads::add
is called (which callson_thread_attached()
which callsflush_mutator
). Obviously theflush_mutator
is redundant because the mutator is just created.
- Similarly, when attaching, a new
- And OpenJDK always calls
Thread::exit()
before destructing theThread
instance everywhere. This makes the flushing inon_thread_detroy
redundant.
Conclusion
A VM binding really wants to communicate only three things with the MMTK core.
- A mutator thread is created. The VM binding requests a
Mutator
instance to be created. (bind_mutator
) - When GC happens, the VM binding reports to the MMTk core all live
Mutator
instances. (ActivePlan::mutators()
) - When a mutator exits, the VM binding reports to the MMTk core that it is no loger used. (
destroy_mutator
)
The memory_manager::flush_mutator()
function has no semantics w.r.t. the VM binding. It should be removed. The API should only expose bind_mutator()
and destroy_mutator()
to the VM binding, and destroy_mutator
should do all the flushing necessary, including the remembered set and the blocks reserved by a mutator.
OpenJDK should call bind_mutator
when attaching a thread, and call destroy_mutator
when detaching a thread.
The name destroy_mutator
It should be better to rename the function destroy_mutator
because it does not deallocate the memory the mutator occupies (which is the obligation of the binding). Alternatives include:
-
unbind_mutator
: This pairs with thebind_mutator
function, and doesn't imply deallocating the memory. -
abandon_mutator
: Implies the mutator will not be used again, but doesn't imply deallocation, either. - Synonyms of "abandon": desert, discard, ditch, drop, forsake, scrap, ...
You need to still expose flush_mutator
(or whatever we end up calling it) in the case where the ScanStackRoots
work-packet is not created by the binding for VMs where it is too difficult/annoying to scan stack roots individually with all the root scanning done in scan_vm_specific_roots
(I honestly think that flushing should be a separate step).
You need to still expose
flush_mutator
(or whatever we end up calling it) in the case where theScanStackRoots
work-packet is not created by the binding for VMs where it is too difficult/annoying to scan stack roots individually with all the root scanning done inscan_vm_specific_roots
(I honestly think that flushing should be a separate step).
I think we currently require bindings to use the callback in stop_all_mutators
for each mutator. As long as the binding calls the callback, we create ScanStackRoots
for the mutator which includes flushing the states. In this case, we do not need a public flush_mutator
method.
https://github.com/mmtk/mmtk-core/blob/e55f872a52475daa8c5d49b9eb49a7588018dca4/src/vm/collection.rs#L19-L20