gctoolkit
gctoolkit copied to clipboard
G1GC Concurrent Cleanup event
Some of the Concurrent times events are not coming: Concurrent Mark From Roots, Concurrent Preclean, Concurrent Rebuild RS.
We believe these should be coming through. Please attach a log so we can debug.
Hi @dsgrieve , first of all thank you very much for a quick response and checking all the issues.
Please find the log attached. As you can see the log does contain Concurrent Mark From Roots, Concurrent Preclean, Concurrent Rebuild Remembered Sets data, I am receiving ConcurrentScanRootRegion, ConcurrentClearClaimedMarks, ConcurrentCleanupForNextMark events, but i am not receiving G1ConcurrentMark and G1ConcurrentRebuildRememberedSets events
g1gc_stress.log
Can confirm this issue.
[29.167s][info][gc ] GC(8) Concurrent Mark Cycle
[29.167s][info][gc,marking ] GC(8) Concurrent Scan Root Regions
[29.171s][info][gc,marking ] GC(8) Concurrent Scan Root Regions 3.827ms
[29.171s][info][gc,marking ] GC(8) Concurrent Mark
[29.171s][info][gc,marking ] GC(8) Concurrent Mark From Roots
[29.171s][info][gc,task ] GC(8) Using 1 workers of 1 for marking
[29.219s][info][gc,marking ] GC(8) Concurrent Mark From Roots 47.708ms
[29.219s][info][gc,marking ] GC(8) Concurrent Preclean
[29.219s][info][gc,marking ] GC(8) Concurrent Preclean 0.090ms
[29.219s][info][gc,start ] GC(8) Pause Remark
**[29.223s][info][gc ] GC(8) Pause Remark 56M->56M(200M) 4.633ms
**[29.223s][info][gc,cpu ] GC(8) User=0.00s Sys=0.00s Real=0.00s
[29.224s][info][gc,marking ] GC(8) Concurrent Mark 52.821ms
[29.224s][info][gc,marking ] GC(8) Concurrent Rebuild Remembered Sets and Scrub Regions
[29.242s][info][gc,marking ] GC(8) Concurrent Rebuild Remembered Sets and Scrub Regions 17.954ms
[29.242s][info][gc,start ] GC(8) Pause Cleanup
[29.242s][info][gc ] GC(8) Pause Cleanup 58M->58M(200M) 0.083ms
**[29.242s][info][gc,cpu ] GC(8) User=0.00s Sys=0.00s Real=0.00s
[29.242s][info][gc,marking ] GC(8) Concurrent Clear Claimed Marks
[29.242s][info][gc,marking ] GC(8) Concurrent Clear Claimed Marks 0.040ms
[29.242s][info][gc,marking ] GC(8) Concurrent Cleanup for Next Mark
[29.243s][info][gc,marking ] GC(8) Concurrent Cleanup for Next Mark 1.371ms
[29.243s][info][gc ] GC(8) Concurrent Mark Cycle 76.439ms
Suspect the issue is that the G1 parser publishes the event before it should, which removes the forwardReference.
I've annotated a couple of lines with ** where an event publish is attempted. The final line does not attempt a publish, but rather just removes the forward reference.
I guess the first question is, is it OK to have multiple Events published from the same GC ID? We should know that we're in a Concurrent Cycle, and we could re-populate the forward reference after each publish. This would probably be the simplest solution.
Alternatively, allowing a hierarchy of events may be useful in the long term for events involving phases, but might require some refactoring.
Thoughts?
I think I've got a fix for this.
The underlying issue appears to be that the forward reference gets removed while we're in the context of a Concurrent Mark event.
In order to fully support the above example from Java 21 that I provided, we may need either a new event type, or to repurpose the existing G1RebuildRememberedSets class.
@dsgrieve @kcpeppe: Would you prefer if I reused the G1RebuildRememberedSets event for "Concurrent Rebuild Remembered Sets and Scrub Regions", or create a new event type when I fix this? Not sure if it's distinct, or just a rename.
As usual, not as easy as I initially thought.
In Java 17, 21, it appears that the Pause Remark event is a "child" event contained within a Concurrent Mark event:
[2025-04-24T13:20:23.110+0000][info][gc,marking ] GC(13) Concurrent Mark
[2025-04-24T13:20:23.110+0000][info][gc,marking ] GC(13) Concurrent Mark From Roots
[2025-04-24T13:20:23.110+0000][info][gc,task ] GC(13) Using 1 workers of 1 for marking
[2025-04-24T13:20:23.173+0000][info][gc,marking ] GC(13) Concurrent Mark From Roots 61.036ms
[2025-04-24T13:20:23.173+0000][info][gc,marking ] GC(13) Concurrent Preclean
[2025-04-24T13:20:23.173+0000][info][gc,marking ] GC(13) Concurrent Preclean 0.160ms
[2025-04-24T13:20:23.173+0000][info][gc,start ] GC(13) Pause Remark
[2025-04-24T13:20:23.189+0000][info][gc ] GC(13) Pause Remark 46M->45M(2048M) 8.228ms
[2025-04-24T13:20:23.189+0000][info][gc,cpu ] GC(13) User=0.03s Sys=0.00s Real=0.02s
[2025-04-24T13:20:23.189+0000][info][gc,marking ] GC(13) Concurrent Mark 70.120ms
Whereas, no such nesting occurs in Java 11:
[18.047s][info][gc,marking ] GC(14) Concurrent Mark (18.047s)
[18.047s][info][gc,marking ] GC(14) Concurrent Mark From Roots
[18.047s][info][gc,task ] GC(14) Using 1 workers of 1 for marking
[18.094s][info][gc,marking ] GC(14) Concurrent Mark From Roots 47.630ms
[18.094s][info][gc,marking ] GC(14) Concurrent Preclean
[18.095s][info][gc,marking ] GC(14) Concurrent Preclean 0.133ms
[18.095s][info][gc,marking ] GC(14) Concurrent Mark (18.047s, 18.095s) 47.822ms
[18.095s][info][gc,start ] GC(14) Pause Remark
[18.104s][info][gc,stringtable] GC(14) Cleaned string and symbol table, strings: 23428 processed, 123 removed, symbols: 116686 processed, 33 removed
[18.104s][info][gc ] GC(14) Pause Remark 53M->52M(2048M) 9.570ms
[18.104s][info][gc,cpu ] GC(14) User=0.06s Sys=0.00s Real=0.02s
So changes made to support Java17, 21 end up breaking test cases against Java 11 test files.
I'll have to think on it a bit further.