ChezScheme
ChezScheme copied to clipboard
Log GC pause times when collect-notify is active
I got curious about GC pause times, so I changed the logging.
a6le/mats/summary
looks clean with this change.
I like this idea. If you feel like it, you might add bytes reclaimed and streamline the message so the message reads something like this:
[collecting generation 1 into 2...reclaimed 3.9M of 1.5G in 22.247ms]
where before and after bytes are computed using bytes-allocated for just the copied generations. I would include just CPU time.
NB. The calculation of diff-us should round rather than truncate. You could us, e.g., (round (/ (time-nanosecond diff) 1000))
or (div (+ (time-nanosecond diff) 500) 1000)
. The first rounds halfway amounts to even; the second rounds up. I think for this application I prefer rounding up.
Do you mean (div (+ (time-nanosecond diff) 999) 1000)
for rounding up?
If we want just one time, I recommend using (monotonic) real time, not CPU time, because that reflects the GC pause the application sees. CPU time can be much less (paging, descheduling).
Is there already a MB/GB formatting function I can reuse? On the other hand, people may want to process this by scripts, so using MB unconditionally may be better.
I actually meant 500. 999 would implement ceiling rather than round.
Using monotonic time and MB is fine with me.
Output now looks like this:
[collecting generation 1 into generation 2...1155.791M->1150.998M in 2.082ms]
[collecting generation 0 into generation 1...1159.033M->1154.305M in 0.301ms]
[collecting generation 0 into generation 1...1162.332M->1157.612M in 0.304ms]
[collecting generation 0 into generation 1...1165.639M->1160.919M in 0.299ms]
[collecting generation 4 into generation 4...1168.947M->319.112M in 153.469ms]
[collecting generation 0 into generation 1...327.144M->322.417M in 0.399ms]
[collecting generation 0 into generation 1...330.449M->325.724M in 0.414ms]
I switched to floating point arithmetic, hoping that's okay.
I think the before/after heap sizes are more informative than the amount of reclaimed memory. (It's what other GCs prints.) It also sidesteps the issue whether the reclaimed memory can go negative.
I'm going to add a LOG
entry once the NOP change has been merged.
I'm not enthusiastic about using floating-point arithmetic for such things because I have to think about whether it can cause problems. In this case I don't think so, but it would be safer to stick with exact arithmetic. You can still use "~,3f"
to format an exact rational number, though I think it would be better to use at most as many digits to the right of the decimal point as necessary to get three significant places. If this seems ugly, I would go back to my original idea of using K, M, G, and T as appropriate and let output-processing scripts deal with it.
More importantly, I think it would be more informative to see how many bytes of the bytes actually subject to collection (i.e., those in the collected generations) were reclaimed (or retained). Getting that is a bit more involved than I described earlier. To compute the bytes subject to collection, I think you'll need to sum the bytes allocated in the collected generations before calling do-gc
, say via (do ([i 0 (fx+ i 1)] [n 0 (+ n (bytes-allocated i))]) ((fx> i g) n))
. To compute the bytes reclaimed you can use the existing logic, i.e., subtract (bytes-allocated)
before do-gc
from (bytes-allocated)
after, since any reclaimed bytes have to have come from the collected generations.
Following up on Kent's last message, are you still interesting in moving forward with this PR?
@fweimer, I'm going to close this pull request for now. You're welcome to post a follow-up message to this request if you'd like to continue to work on it.