
High CPU usage, target basically never runs

Open seanptmaher opened this issue 1 month ago • 8 comments

I have a very similar issue to #667, but I am still hitting it after building @ 4c32238b382729c65dc57147882c7e9e7efa401e.

Repro is building chromium, and then launching it (usually with a file breakpoint). The program then basically never becomes interactive, even if I wait many minutes (~5). Windbg launches it ~instantly and it becomes interactive instantly.

Here's a screencap: https://drive.google.com/file/d/1AVb64m1gyCd-PQ2BTYvVkTtOuDIbuygp/view?usp=sharing

Here's my %appdata%\raddbg folder: https://drive.google.com/file/d/1Cfu2CIWFXnZwim3P2lXRvTmPBh91LbXz/view?usp=sharing

I took an xperf profile as well, with some screenshots attached showing a flamegraph of the CPU profile.

[Images: four screenshots of the CPU profile flamegraph]

Here's the ETL file (please request access with a reasonably legitimate email address and I'll share it; ETL files contain a lot of PII): https://drive.google.com/file/d/1-R3Geqz00ErYtWJaK5BePaUs1uw2MocX/view?usp=sharing

Here's the PDB file (if your symbol cache is C:\src\symbols, as mine is, then place it at c:\src\symbols\raddbg.pdb\151791EB98B64908AB3BE5B9472109549\raddbg.pdb)
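
The path above has the shape `<cache root>\<pdb name>\<identifier>\<pdb name>`. A minimal Python sketch of composing it, assuming the identifier directory is copied verbatim from this comment; `symbol_cache_path` is a hypothetical helper for illustration, not part of raddbg or WinDbg:

```python
import ntpath  # applies Windows path rules regardless of the host OS


def symbol_cache_path(cache_root: str, pdb_name: str, pdb_id: str) -> str:
    # Hypothetical helper: the PDB sits in a directory named after itself,
    # then in a directory named after its identifier.
    return ntpath.join(cache_root, pdb_name, pdb_id, pdb_name)


if __name__ == "__main__":
    print(symbol_cache_path(
        r"C:\src\symbols",
        "raddbg.pdb",
        "151791EB98B64908AB3BE5B9472109549",
    ))
```

The same function can be pointed at a different cache root if your symbol cache is not at C:\src\symbols.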

seanptmaher avatar Nov 05 '25 22:11 seanptmaher

I think I've at least partly solved the high CPU usage issue (see #681); I'm now investigating the Chromium performance while debugged.

I seem to have very different results in both RADDBG / WinDbg depending on whether or not child process debugging is enabled. Your video shows that you have child process debugging enabled in RADDBG (Debug Subprocesses in the target settings). In WinDbg, this is enabled via the .childdbg 1 command (which resets on each process launch - you need to set it while the initial process is stopped at ntdll!LdrpDoDebuggerBreak).

Are you comparing multi-process debugging performance in RADDBG with multi-process debugging in WinDbg? If not, do you still notice the same issue when you disable child process debugging in RADDBG?

EDIT: Some extra information... On my end, Chromium becomes interactive in far less than five minutes (although with all sub-processes attached it takes much longer in both RADDBG/WinDbg). But with just single process debugging, both are interactive in nearly the same amount of time. I do not know what your development machine is like compared to mine though, so it may or may not be indicative of a bug.

ryanfleury avatar Nov 07 '25 19:11 ryanfleury

I was comparing windbg with multi-process debugging, yes. It's not that powerful of a machine, it's a 7840u laptop + 64gb ram. Here's another video showing the comparison (not sure if it's helpful, because I recorded it before that last commit you mentioned in the other issue, but.. this was what it looked like before): https://drive.google.com/file/d/13PYU30ixraTyZCcTpICvbd_7u1aNqrrY/view?usp=sharing

I'll try again after pulling.

Edit: You can apply .childdbg 1 and sxi ibp via the command line with the -o and -g flags, respectively. That's what I was mousing over at the start of the video.
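
A minimal sketch of that command line, written as a small Python helper. It assumes `windbg.exe` is on `PATH`; `windbg_command` and the `chrome.exe` target are illustrative placeholders, and only the `-o` and `-g` flags come from the comment above:

```python
def windbg_command(target: str, debug_children: bool = True,
                   skip_initial_break: bool = True) -> list[str]:
    # Hypothetical helper: build the WinDbg argument list for a target.
    cmd = ["windbg.exe"]
    if debug_children:
        cmd.append("-o")  # command-line counterpart of .childdbg 1
    if skip_initial_break:
        cmd.append("-g")  # command-line counterpart of sxi ibp
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.Popen on a
    # Windows machine to actually launch the debugger.
    print(windbg_command("chrome.exe"))
```

Dropping `-o` (`debug_children=False`) gives the single-process case for comparison.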

seanptmaher avatar Nov 07 '25 21:11 seanptmaher

After pulling and relaunching, it took >5 minutes with a cold start (about 1 minute of loading debug symbols, so ~4 minutes from window launch) to become interactive. It honestly doesn't seem like much changed for this case with that commit. If you think it'd be helpful, I can take another profile.

seanptmaher avatar Nov 07 '25 21:11 seanptmaher

The debugger will not wait for any PDB -> RDI conversions that it deems unnecessary for breakpoint resolution, so I don't think the symbol loading would make a huge difference (the debugger may report that it is actively loading symbols, but that should not impact the debuggee performance, other than the obvious extra contention for system resources).

We can test this by looking at either (a) what the performance is like without any breakpoints set, or (b) what the performance is like after the debugger has produced all RDI files needed for Chromium. Do you notice anything different in those two cases?

ryanfleury avatar Nov 07 '25 21:11 ryanfleury

FWIW, if I debug Chromium with "Debug Subprocesses" and have 2 breakpoints enabled, I also experience a slow start of Chromium, slower than if the breakpoints were disabled. It's not as slow as 5 minutes, but it's a noticeable slowdown (about 23 seconds). It takes about 8 seconds normally. During the startup of Chromium, raddbg also has high CPU usage, which settles down after Chromium has loaded. I feel like some of this slowdown might be related to https://github.com/EpicGamesExt/raddebugger/issues/384. So @seanptmaher might also want to do a test with breakpoints enabled but "Debug Subprocesses" disabled.

mistymntncop avatar Nov 07 '25 22:11 mistymntncop

That would be surprising to me, because this case seems to have an extremely small number of traps to write... (unless this code is heavily inlined, which I didn't think it was). But I need to investigate #384 - that might reveal a bug or design flaw that is causing this problem.

ryanfleury avatar Nov 07 '25 22:11 ryanfleury

Indeed, launching without any breakpoints is much faster (i.e. basically instant, like windbg).

> The debugger will not wait for any PDB -> RDI conversions that it deems are unnecessary for breakpoint resolution, so I don't think the symbol loading would make a huge difference (the debugger may report that it is actively loading symbols but that should not impact the debuggee performance, other than the obvious more contention for system usage).

How does it determine what PDBs are necessary? This build of chrome is a 'component build', meaning that it has ~500 PDBs. I could easily see a world where it doesn't know which PDBs are actually required and spends longer than it should on it.

seanptmaher avatar Nov 10 '25 19:11 seanptmaher

Also, I'm not sure how to only trace 'after the debugger has produced all RDI files needed'. Can you provide more info on what you mean, given that the debugger isn't waiting for any RDI files that it doesn't think are necessary for breakpoint resolution?

Can I generate the RDI files as part of a build step easily? (that might actually just be preferable in general, honestly...)

EDIT: how the heck did you use radlink.exe to link chrome.dll? I haven't found a reasonable way to modify the linker that GN uses.

seanptmaher avatar Nov 10 '25 21:11 seanptmaher