gctoolkit icon indicating copy to clipboard operation
gctoolkit copied to clipboard

G1GC Concurrent Cleanup event

Open AlinaHovakimyan opened this issue 1 year ago • 5 comments
trafficstars

Some of the Concurrent times events are not coming: Concurrent Mark From Roots, Concurrent Preclean, Concurrent Rebuild RS.

AlinaHovakimyan avatar Sep 02 '24 13:09 AlinaHovakimyan

We believe these should be coming through. Please attach a log so we can debug.

dsgrieve avatar Sep 04 '24 20:09 dsgrieve

Hi @dsgrieve , first of all thank you very much for a quick response and checking all the issues. Please find the log attached. As you can see the log does contain Concurrent Mark From Roots, Concurrent Preclean, Concurrent Rebuild Remembered Sets data, I am receiving ConcurrentScanRootRegion, ConcurrentClearClaimedMarks, ConcurrentCleanupForNextMark events, but i am not receiving G1ConcurrentMark and G1ConcurrentRebuildRememberedSets events g1gc_stress.log Screenshot 2024-09-12 at 11 01 58

AlinaHovakimyan avatar Sep 12 '24 07:09 AlinaHovakimyan

Can confirm this issue.

[29.167s][info][gc          ] GC(8) Concurrent Mark Cycle
[29.167s][info][gc,marking  ] GC(8) Concurrent Scan Root Regions
[29.171s][info][gc,marking  ] GC(8) Concurrent Scan Root Regions 3.827ms
[29.171s][info][gc,marking  ] GC(8) Concurrent Mark
[29.171s][info][gc,marking  ] GC(8) Concurrent Mark From Roots
[29.171s][info][gc,task     ] GC(8) Using 1 workers of 1 for marking
[29.219s][info][gc,marking  ] GC(8) Concurrent Mark From Roots 47.708ms
[29.219s][info][gc,marking  ] GC(8) Concurrent Preclean
[29.219s][info][gc,marking  ] GC(8) Concurrent Preclean 0.090ms
[29.219s][info][gc,start    ] GC(8) Pause Remark
**[29.223s][info][gc          ] GC(8) Pause Remark 56M->56M(200M) 4.633ms
**[29.223s][info][gc,cpu      ] GC(8) User=0.00s Sys=0.00s Real=0.00s
[29.224s][info][gc,marking  ] GC(8) Concurrent Mark 52.821ms
[29.224s][info][gc,marking  ] GC(8) Concurrent Rebuild Remembered Sets and Scrub Regions
[29.242s][info][gc,marking  ] GC(8) Concurrent Rebuild Remembered Sets and Scrub Regions 17.954ms
[29.242s][info][gc,start    ] GC(8) Pause Cleanup
[29.242s][info][gc          ] GC(8) Pause Cleanup 58M->58M(200M) 0.083ms
**[29.242s][info][gc,cpu      ] GC(8) User=0.00s Sys=0.00s Real=0.00s
[29.242s][info][gc,marking  ] GC(8) Concurrent Clear Claimed Marks
[29.242s][info][gc,marking  ] GC(8) Concurrent Clear Claimed Marks 0.040ms
[29.242s][info][gc,marking  ] GC(8) Concurrent Cleanup for Next Mark
[29.243s][info][gc,marking  ] GC(8) Concurrent Cleanup for Next Mark 1.371ms
[29.243s][info][gc          ] GC(8) Concurrent Mark Cycle 76.439ms

Suspect the issue is that the G1 parser publishes the event before it should, which removes the forwardReference.

I've annotated a couple of lines with ** where an event publish is attempted. The final line does not attempt a publish, but rather just removes the forward reference.

I guess the first question is, is it OK to have multiple Events published from the same GC ID? We should know that we're in a Concurrent Cycle, and we could re-populate the forward reference after each publish. This would probably be the simplest solution.

Alternatively, allowing a hierarchy of events may be useful in the long term for events involving phases, but might require some refactoring.

Thoughts?

jlittle-ptc avatar Apr 23 '25 15:04 jlittle-ptc

I think I've got a fix for this.

The underlying issue appears to be that the forward reference gets removed while we're in the context of a Concurrent Mark event.

In order to fully support the above example from Java 21 that I provided, we may need either a new event type, or to repurpose the existing G1RebuildRememberedSets class.

@dsgrieve @kcpeppe: Would you prefer if I reused the G1RebuildRememberedSets event for "Concurrent Rebuild Remembered Sets and Scrub Regions", or create a new event type when I fix this? Not sure if it's distinct, or just a rename.

jlittle-ptc avatar Apr 29 '25 14:04 jlittle-ptc

As usual, not as easy as I initially thought.

In Java 17, 21, it appears that the Pause Remark event is a "child" event contained within a Concurrent Mark event:

[2025-04-24T13:20:23.110+0000][info][gc,marking  ] GC(13) Concurrent Mark
[2025-04-24T13:20:23.110+0000][info][gc,marking  ] GC(13) Concurrent Mark From Roots
[2025-04-24T13:20:23.110+0000][info][gc,task     ] GC(13) Using 1 workers of 1 for marking
[2025-04-24T13:20:23.173+0000][info][gc,marking  ] GC(13) Concurrent Mark From Roots 61.036ms
[2025-04-24T13:20:23.173+0000][info][gc,marking  ] GC(13) Concurrent Preclean
[2025-04-24T13:20:23.173+0000][info][gc,marking  ] GC(13) Concurrent Preclean 0.160ms
[2025-04-24T13:20:23.173+0000][info][gc,start    ] GC(13) Pause Remark
[2025-04-24T13:20:23.189+0000][info][gc          ] GC(13) Pause Remark 46M->45M(2048M) 8.228ms
[2025-04-24T13:20:23.189+0000][info][gc,cpu      ] GC(13) User=0.03s Sys=0.00s Real=0.02s
[2025-04-24T13:20:23.189+0000][info][gc,marking  ] GC(13) Concurrent Mark 70.120ms

Whereas, no such nesting occurs in Java 11:

[18.047s][info][gc,marking    ] GC(14) Concurrent Mark (18.047s)
[18.047s][info][gc,marking    ] GC(14) Concurrent Mark From Roots
[18.047s][info][gc,task       ] GC(14) Using 1 workers of 1 for marking
[18.094s][info][gc,marking    ] GC(14) Concurrent Mark From Roots 47.630ms
[18.094s][info][gc,marking    ] GC(14) Concurrent Preclean
[18.095s][info][gc,marking    ] GC(14) Concurrent Preclean 0.133ms
[18.095s][info][gc,marking    ] GC(14) Concurrent Mark (18.047s, 18.095s) 47.822ms

[18.095s][info][gc,start      ] GC(14) Pause Remark
[18.104s][info][gc,stringtable] GC(14) Cleaned string and symbol table, strings: 23428 processed, 123 removed, symbols: 116686 processed, 33 removed
[18.104s][info][gc            ] GC(14) Pause Remark 53M->52M(2048M) 9.570ms
[18.104s][info][gc,cpu        ] GC(14) User=0.06s Sys=0.00s Real=0.02s

So changes made to support Java17, 21 end up breaking test cases against Java 11 test files.

I'll have to think on it a bit further.

jlittle-ptc avatar Apr 29 '25 15:04 jlittle-ptc