openj9
Redesign of Locked Synchronizer processing
The JVM is supposed to support the APIs that return the list of threads together with their Locked (Ownable) Synchronizers: https://docs.oracle.com/javase/7/docs/api/java/lang/management/ThreadMXBean.html#dumpAllThreads(boolean,%20boolean) https://docs.oracle.com/javase/7/docs/api/java/lang/management/ThreadMXBean.html#getThreadInfo(long[],%20int)
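For context, here is a minimal sketch of how a caller reaches this path through the standard `java.lang.management` API. The class and method names `LockedSynchronizerDump` and `countLockedSynchronizers` are made up for illustration; only the `ThreadMXBean` calls are real.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantLock;

public class LockedSynchronizerDump {
    /**
     * Returns how many ownable synchronizers the given thread holds according to
     * ThreadMXBean, or -1 if the JVM does not support synchronizer monitoring.
     */
    static int countLockedSynchronizers(long threadId) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isSynchronizerUsageSupported()) {
            return -1; // this monitoring feature is optional for a JVM
        }
        // lockedMonitors = false, lockedSynchronizers = true
        for (ThreadInfo info : bean.dumpAllThreads(false, true)) {
            if (info.getThreadId() == threadId) {
                return info.getLockedSynchronizers().length;
            }
        }
        return 0; // thread not found in the dump
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock(); // backed by an AbstractOwnableSynchronizer
        lock.lock();
        try {
            long self = Thread.currentThread().getId();
            System.out.println("locked synchronizers held: " + countLockedSynchronizers(self));
        } finally {
            lock.unlock();
        }
    }
}
```

Any call with `lockedSynchronizers = true` is what forces the JVM to produce the synchronizer list discussed in this issue.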
We currently maintain a list of all live Synchronizers:
- every time we create an instance of such an object we add it to the list (the JIT is supposed to invoke a callback that enqueues the object)
- at STW GCs and in Concurrent Scavenger we reset the list content and rebuild it. We recognize the type of each object while traversing the object graph, and at that point we add it to the new list.
- for Concurrent Marking GC we use a 'weak root' approach. We don't reset the list at GC start, but we prune the existing list at the end of the GC cycle: we iterate the list and remove all entries that did not survive.
The upside of this approach is that the list is readily available (except in the CS case: if the API is invoked in the middle of a cycle, we are forced to finish the cycle before returning), but:
- the code is somewhat complex, spread out over tens of files, and occasionally we get confusing bugs due to the unusual list organization (without going into details why, the list does not terminate with null, but with a value pointing back to the last item in the list)
- for the Concurrent Marking case, the pruning pass can take a few milliseconds (an example is provided below with ~9ms)
<gc-op id="333" type="rs-scan" timems="31.092" contextid="327" timestamp="2022-06-01T11:25:40.746">
<scan objectsFound="680386" bytesTraced="34932988" workStackOverflowCount="0" />
</gc-op>
<gc-op id="334" type="card-cleaning" timems="194.066" contextid="327" timestamp="2022-06-01T11:25:40.941">
<card-cleaning cardsCleaned="4763" bytesTraced="403140304" workStackOverflowCount="0" />
</gc-op>
<gc-op id="335" type="mark" timems="33.611" contextid="327" timestamp="2022-06-01T11:25:40.974">
<trace-info objectcount="323706" scancount="323667" scanbytes="7506328" />
<finalization candidates="150" enqueued="0" />
<ownableSynchronizers candidates="655898" cleared="67876" />
<references type="soft" candidates="722083" cleared="0" enqueued="0" dynamicThreshold="24" maxThreshold="32" />
<references type="weak" candidates="4846" cleared="283" enqueued="0" />
<references type="phantom" candidates="132" cleared="0" enqueued="0" />
<stringconstants candidates="11092" cleared="0" />
<object-monitors candidates="23007" cleared="22879" />
</gc-op>
<gc-op id="336" type="sweep" timems="27.495" contextid="327" timestamp="2022-06-01T11:25:41.002" />
<gc-end id="337" type="global" contextid="327" durationms="287.909" usertimems="2120.607" systemtimems="13.495" stalltimems="27.892" timestamp="2022-06-01T11:25:41.003" activeThreads="8">
<mem-info id="338" free="12353736568" total="15032385536" percent="82">
<mem type="nursery" free="10761118744" total="12884901888" percent="83">
<mem type="allocate" free="10761118744" total="11596398592" percent="92" />
<mem type="survivor" free="0" total="1288503296" percent="0" />
</mem>
<mem type="tenure" free="1592617824" total="2147483648" percent="74" micro-fragmented="55117290" macro-fragmented="13926464">
<mem type="soa" free="1485243232" total="2040109056" percent="72" />
<mem type="loa" free="107374592" total="107374592" percent="100" />
</mem>
<remembered-set count="211890" />
</mem-info>
<scan timestamp="Jun 01 11:25:41 2022">
<thread id="0" classes="0.419" threads="0.542" unfinalizedobjects="0.010" ownablesynchronizerobjects="9.058" stringtable="0.090" weakrefs="0.975" softrefs="0.001" phantomrefs="9.101" jvmtiobjecttagtables="0.068" unfinalizedobjectscomplete="0.084" ownablesynchronizerobjectscomplete="0.138" monitorlookupcachescomplete="0.068" doubleMappedObjects="0.001"/>
<thread id="1" classes="0.419" threads="0.538" unfinalizedobjects="0.008" ownablesynchronizerobjects="8.361" stringtable="0.164" weakrefs="0.955" softrefs="0.060" phantomrefs="8.777" jvmtiobjecttagtables="0.054" noncollectableobjects="0.013" unfinalizedobjectscomplete="0.105" ownablesynchronizerobjectscomplete="0.135" monitorlookupcachescomplete="0.062" doubleMappedObjects="0.001"/>
<thread id="2" classes="0.433" threads="0.538" unfinalizedobjects="0.019" ownablesynchronizerobjects="9.128" stringtable="0.001" weakrefs="1.441" softrefs="0.001" phantomrefs="10.544" jvmtiobjecttagtables="0.068" unfinalizedobjectscomplete="0.089" ownablesynchronizerobjectscomplete="0.144" monitorlookupcachescomplete="0.065" doubleMappedObjects="0.001"/>
<thread id="3" classes="0.394" classloaders="0.030" threads="0.533" unfinalizedobjects="0.002" ownablesynchronizerobjects="8.747" stringtable="0.165" jniweakglobalrefs="0.006" weakrefs="0.944" softrefs="0.057" phantomrefs="8.777" jvmtiobjecttagtables="0.033" unfinalizedobjectscomplete="0.118" ownablesynchronizerobjectscomplete="0.153" monitorlookupcachescomplete="0.061" doubleMappedObjects="0.001"/>
<thread id="4" classes="0.411" threads="0.536" finalizableobjects="0.001" unfinalizedobjects="0.002" ownablesynchronizerobjects="8.466" stringtable="0.083" jniglobalrefs="0.008" weakrefs="1.011" softrefs="0.089" phantomrefs="8.768" jvmtiobjecttagtables="0.057" unfinalizedobjectscomplete="0.105" ownablesynchronizerobjectscomplete="0.136" monitorlookupcachescomplete="0.066" doubleMappedObjects="0.001"/>
<thread id="5" classes="0.425" threads="0.533" unfinalizedobjects="0.003" ownablesynchronizerobjects="8.963" stringtable="0.001" weakrefs="1.149" softrefs="0.088" phantomrefs="8.740" jvmtiobjecttagtables="0.040" unfinalizedobjectscomplete="0.104" ownablesynchronizerobjectscomplete="0.135" monitorlookupcachescomplete="0.069" doubleMappedObjects="0.001"/>
<thread id="6" classes="0.415" threads="0.538" unfinalizedobjects="0.007" ownablesynchronizerobjects="8.384" stringtable="0.079" weakrefs="1.011" softrefs="0.090" phantomrefs="8.727" jvmtiobjecttagtables="0.066" memoryareaobjects="7.015" unfinalizedobjectscomplete="0.092" ownablesynchronizerobjectscomplete="0.119" monitorlookupcachescomplete="0.062" doubleMappedObjects="0.001"/>
<thread id="7" classes="0.422" threads="0.534" unfinalizedobjects="0.003" ownablesynchronizerobjects="8.427" stringtable="0.064" weakrefs="1.035" softrefs="0.001" phantomrefs="8.902" jvmtiobjecttagtables="0.063" unfinalizedobjectscomplete="0.089" ownablesynchronizerobjectscomplete="0.145" monitorlookupcachescomplete="0.074" doubleMappedObjects="0.001"/>
<total classes="3.338" classloaders="0.030" threads="4.292" finalizableobjects="0.001" unfinalizedobjects="0.054" ownablesynchronizerobjects="69.534" stringtable="0.647" jniglobalrefs="0.008" jniweakglobalrefs="0.006" weakrefs="8.521" softrefs="0.387" phantomrefs="72.336" jvmtiobjecttagtables="0.449" noncollectableobjects="0.013" memoryareaobjects="7.015" unfinalizedobjectscomplete="0.786" ownablesynchronizerobjectscomplete="1.105" monitorlookupcachescomplete="0.527" doubleMappedObjects="0.008"/>
</scan>
</gc-end>
<cycle-end id="339" type="global" contextid="327" timestamp="2022-06-01T11:25:41.013" />
<exclusive-end id="340" timestamp="2022-06-01T11:25:41.013" durationms="300.598" />
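The ~9ms figure can be checked against the `ownablesynchronizerobjects` attributes in the `<scan>` element above. The sketch below only redoes that arithmetic; the class name `PruneTimeSummary` is made up, and reading the per-thread values as parallel work (so that wall-clock cost is roughly the slowest thread) is an assumption, not something the log states.

```java
public class PruneTimeSummary {
    // ownablesynchronizerobjects values (ms) for threads 0..7, copied from the log above
    static final double[] PER_THREAD_MS = {
        9.058, 8.361, 9.128, 8.747, 8.466, 8.963, 8.384, 8.427
    };

    /** Sum of all per-thread times; should agree with the <total> element. */
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Largest per-thread time; approximates wall-clock cost if threads run in parallel. */
    static double max(double[] values) {
        double best = values[0];
        for (double v : values) {
            best = Math.max(best, v);
        }
        return best;
    }

    public static void main(String[] args) {
        double total = sum(PER_THREAD_MS);
        System.out.printf("total=%.3f ms, mean=%.3f ms, max=%.3f ms%n",
                total, total / PER_THREAD_MS.length, max(PER_THREAD_MS));
    }
}
```

The eight values add up to the 69.534 reported in `<total>`, which is about 8.7 ms per thread on average with a maximum of 9.128 ms.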
The new approach would be passive, i.e. on demand: we would walk the heap, recognize such objects, and add them to the list. While we believe that walking the heap at a random point in time should not be an issue (the class pointer is valid for some time even for unloaded classes, until we kill class segments, and when that occurs we convert dead objects to holes), we realize that the content of dead objects found during the walk is stale.
Since the API requires Synchronizers to be associated with their owning threads, we have to match the owner thread info from the synchronizer object against the list of currently live threads. In the unlikely case that an object is dead and its dead owner thread happens to have the same id as a currently live thread, we would incorrectly report the dead object.
So, to be completely safe, we will rebuild the liveness information by performing a GC before actually doing the heap walk. Probably the simplest approach is to do it the same way we deal with JVMTI APIs: we invoke ensureHeapWalkable, which not only rebuilds the liveness info but also removes all dead objects (makes holes out of them).
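The on-demand flow described above can be modelled roughly as follows. This is a toy simulation in Java, not OpenJ9 code: `HeapObject`, `NO_OWNER`, `collect`, and the list-based `ensureHeapWalkable` are hypothetical stand-ins that only share a name or idea with the real mechanism.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OnDemandSynchronizerWalk {
    static final long NO_OWNER = -1L; // hypothetical marker for an unowned synchronizer

    /** Hypothetical stand-in for a heap object; not an OpenJ9 type. */
    static final class HeapObject {
        final boolean ownableSynchronizer;
        final boolean live;
        final long ownerThreadId;

        HeapObject(boolean ownableSynchronizer, boolean live, long ownerThreadId) {
            this.ownableSynchronizer = ownableSynchronizer;
            this.live = live;
            this.ownerThreadId = ownerThreadId;
        }
    }

    /** Toy model of ensureHeapWalkable: dead objects are dropped (become holes). */
    static List<HeapObject> ensureHeapWalkable(List<HeapObject> heap) {
        List<HeapObject> walkable = new ArrayList<>();
        for (HeapObject o : heap) {
            if (o.live) {
                walkable.add(o);
            }
        }
        return walkable;
    }

    /** Walks the cleaned heap and groups owned synchronizers by live owner thread id. */
    static Map<Long, List<HeapObject>> collect(List<HeapObject> heap, Set<Long> liveThreadIds) {
        Map<Long, List<HeapObject>> byOwner = new HashMap<>();
        for (HeapObject o : ensureHeapWalkable(heap)) {
            if (o.ownableSynchronizer && liveThreadIds.contains(o.ownerThreadId)) {
                byOwner.computeIfAbsent(o.ownerThreadId, k -> new ArrayList<>()).add(o);
            }
        }
        return byOwner;
    }

    public static void main(String[] args) {
        List<HeapObject> heap = List.of(
                new HeapObject(true, true, 1L),   // live synchronizer owned by thread 1
                new HeapObject(true, false, 1L),  // dead synchronizer with a stale owner id
                new HeapObject(false, true, NO_OWNER));
        Map<Long, List<HeapObject>> result = collect(heap, Set.of(1L, 2L));
        System.out.println("thread 1 holds " + result.get(1L).size() + " synchronizer(s)");
    }
}
```

Without the `ensureHeapWalkable` step, the dead synchronizer with the stale owner id would be attributed to live thread 1, which is the misreport the GC-before-walk design avoids.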
While this approach removes the downsides of the current one, it will clearly be much slower to respond to the API. The documentation, however, warns that these APIs can be slow and gives no guidelines on how quickly we should respond:
This method is designed for troubleshooting use, but not for synchronization control. It might be an expensive operation.
@DanHeidinga @pshipton @vijaysun-omr
New implementation https://github.com/eclipse-openj9/openj9/pull/15173
As it stands now, the PR effectively removes 2K lines of code (+472 −2,472). Some more is to be removed (the JIT callback) at a later point.
@0xdaryl @zl-wang @joransiu @r30shah for your awareness.
Does this change have any bearing on the "live monitor metadata" that the JIT emits so that some user-level API can return the set of locked monitor objects? I ask because that is a complex piece of code in the JIT, and if this change affects it (e.g. allows us to remove it), that would be good for the JIT team to know.
https://github.com/eclipse/omr/pull/6633 should be reverted after this work is completed.