Implement weak reference processing and finalisation in mmtk-openjdk

Open wks opened this issue 2 years ago • 1 comments

Parent issue: https://github.com/mmtk/mmtk-core/issues/694

Previously, mmtk-core included an implementation of a weak reference processor and a finaliser processor ported from JikesRVM. mmtk-openjdk, the OpenJDK binding, also used that framework. That approach is problematic. (See https://github.com/mmtk/mmtk-core/issues/694)

With the generalised language-neutral weak reference processing API introduced in mmtk-core (See: https://github.com/mmtk/mmtk-core/pull/700), we should implement the reference/finalisation processors in mmtk-openjdk in an OpenJDK-specific manner.

Related work:

https://github.com/wks/mmtk-openjdk/tree/gen-weakref-api This was my experiment. I copied and pasted the ReferenceProcessor and FinalizableProcessor into mmtk-openjdk with minimal modification, and used the new API to communicate with mmtk-core. This shows that the new API works, but the ReferenceProcessor and FinalizableProcessor are still implemented in the JikesRVM style.
https://github.com/wenyuzhao/mmtk-openjdk/tree/lxr This branch is mainly for the LXR GC algorithm, but it also includes a new implementation of the reference/finalisation processor that is more closely integrated with OpenJDK. However, this implementation depends on a modified version of mmtk-core that contains an extended but still Java-specific weak reference processing API.

We should implement reference/finalisation processing in mmtk-openjdk in a way closely related to OpenJDK's own reference/finalisation processing mechanism, but we should use the new language-neutral API.

Key to using the new API is handling the multiple strengths of references. By using the return value of the Scanning::process_weak_refs function, the VM binding can build a state machine to handle soft/weak/final/phantom references while expanding the transitive closure multiple times. (See: https://github.com/wks/mmtk-openjdk/blob/gen-weakref-api/mmtk/src/weak_processor/mod.rs)
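To make the state-machine idea concrete, here is a minimal standalone sketch. It does not use mmtk-core. It only assumes that the return value of `Scanning::process_weak_refs` is a boolean meaning "expand the transitive closure from the newly traced objects, then call me again". The `Phase` enum, `WeakRefStateMachine`, and the phase split are hypothetical names chosen for illustration, and the `log` field only records what a real implementation would do at each step.

```rust
/// Hypothetical phases, ordered by reference strength (soft, weak, final, phantom).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    RetainSoft,
    ClearSoftAndWeak,
    Finalize,
    ClearPhantom,
    Done,
}

/// Hypothetical per-GC state kept by the VM binding.
struct WeakRefStateMachine {
    phase: Phase,
    /// Stand-in for real work: records which steps ran, in order.
    log: Vec<&'static str>,
}

impl WeakRefStateMachine {
    fn new() -> Self {
        Self { phase: Phase::RetainSoft, log: Vec::new() }
    }

    /// Assumed contract: return `true` if objects were traced and the
    /// transitive closure must be expanded before the next call.
    fn process_weak_refs(&mut self) -> bool {
        loop {
            let (action, traced_new_objects, next) = match self.phase {
                // Tracing soft referents may make more objects reachable.
                Phase::RetainSoft => ("retain soft referents", true, Phase::ClearSoftAndWeak),
                // Clearing dead referents traces nothing, so fall through.
                Phase::ClearSoftAndWeak => ("clear soft and weak refs", false, Phase::Finalize),
                // Resurrecting finalizable objects traces them again.
                Phase::Finalize => ("resurrect finalizable objects", true, Phase::ClearPhantom),
                Phase::ClearPhantom => ("clear phantom refs", false, Phase::Done),
                Phase::Done => return false,
            };
            self.log.push(action);
            self.phase = next;
            if traced_new_objects {
                return true;
            }
        }
    }
}
```

In this sketch the function is called three times per GC: the first two calls return `true` (after retaining soft referents and after resurrecting finalizable objects), and the third returns `false` once phantom references have been cleared.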

Mar 18 '23 15:03 wks

After reading the source code of OpenJDK, I think we don't need to copy exactly what OpenJDK is doing. For example, we may organize discovered references in vectors instead of linked lists if that is friendlier to work packets and parallel processing. We may still implement things in Rust and follow the same principles as the existing C++ code.
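As a rough illustration of why vectors may suit work packets better, the sketch below splits a flat vector of discovered references into fixed-size chunks that could each be handed to a worker. `ObjectRef`, `RefProcessingPacket`, and `into_packets` are hypothetical names, not mmtk-core or mmtk-openjdk types, and a plain `usize` stands in for an object address.

```rust
/// Hypothetical stand-in for the address of a discovered `Reference` object.
type ObjectRef = usize;

/// Hypothetical unit of work that one GC worker would process.
#[derive(Debug, PartialEq, Eq)]
struct RefProcessingPacket {
    refs: Vec<ObjectRef>,
}

/// Split discovered references into packets of at most `packet_size` entries.
/// Unlike a linked list, a vector can be chunked without walking every node.
fn into_packets(discovered: Vec<ObjectRef>, packet_size: usize) -> Vec<RefProcessingPacket> {
    assert!(packet_size > 0, "packet_size must be non-zero");
    discovered
        .chunks(packet_size)
        .map(|chunk| RefProcessingPacket { refs: chunk.to_vec() })
        .collect()
}
```

For example, five discovered references with a packet size of two would yield three packets, the last holding a single reference.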

I am not sure about how much code can be reused for JikesRVM. One fundamental difference, as mentioned in https://github.com/mmtk/mmtk-openjdk/issues/184#issuecomment-1281834096, is that OpenJDK discovers Reference instances during transitive closure, while JikesRVM registers Reference instances on creation.

Aug 19 '25 02:08 wks