diagnostics icon indicating copy to clipboard operation
diagnostics copied to clipboard

Updated gcdecoder from main

Open VSadov opened this issue 8 months ago • 2 comments
trafficstars

Fixes: https://github.com/dotnet/diagnostics/issues/4795

Makes the gcdecoder in sync with the 10.0 runtime.

UNDONE: support for pre-10.0 runtimes (9.X and 8.X basically)

VSadov avatar Mar 24 '25 16:03 VSadov

Prior to this change doing !gcinfo for System.Threading.ManualResetEventSlim.Wait on net8.0 runtime produces:

0:000> !gcinfo 00007fff2906893e
entry point 00007FFF29068780
preJIT generated code
GC info 00007FFF28D8C778
Pointer table:
Prolog size: 0
GS cookie: <none>
PSPSym: initial.sp+30
Generics inst context: <none>
PSP slot: caller.sp+30
GenericInst slot: <none>
Varargs: 0
Frame pointer: rbp
Wants Report Only Leaf: 1
Size of parameter area: 30
Return Kind: Scalar                                         <-- correctly reports return kind
Code size: 460
Untracked: +rbp+20 +rbp-48
00000035 interruptible
00000035 +rbp+10 +rcx
0000004d +rax
00000057 -rax
00000069 -rcx
000000a4 +rcx
000000be -rcx
000000c2 +rcx
000000c8 -rcx
000000ce +rax(interior)
000000d5 +r8
. . .   and so on. Long list of liveness changes.

Same with net10.0

:000> !gcinfo 00007ffeb9c01302
entry point 00007FFEB9C01130
preJIT generated code
GC info 00007FFEB989684C
Pointer table:
Prolog size: 0
GS cookie: <none>
PSPSym: initial.sp+200
Generics inst context: <none>
PSP slot: caller.sp+200
GenericInst slot: <none>
Varargs: 0
Frame pointer: r13
Wants Report Only Leaf: 1
Size of parameter area: 6160
Return Kind: {ByRef, Object}         <-   10.0 no longer includes Return Kind in GC Info. This is bogus. And most other info too, since the return kind was right after header flags.
Code size: 93
c0000005 Exception in C:\Users\<userName>\.dotnet\sos\sos.dll.gcinfo debugger extension.
      PC: 00007ffe`b9403065  VA: 0000028d`2c14de28  R/W: 0  Parameter: 00000000`00000000

After this PR with net10.0

0:000> !gcinfo 00007ffeb9d51302
entry point 00007FFEB9D51130
preJIT generated code
GC info 00007FFEB99E684C
Pointer table:
Prolog size: 0
GS cookie: <none>
PSPSym: initial.sp+30
Generics inst context: <none>
PSP slot: caller.sp+30
GenericInst slot: <none>
Varargs: 0
Frame pointer: rbp
Wants Report Only Leaf: 1
Size of parameter area: 30
Code size: 436
Untracked: +rbp+20 +rbp-48
00000034 interruptible
00000034 +rbp+10 +rcx
0000004c +r8
00000057 -r8
0000005c -rbp+10 -rcx
00000066 +r8
00000073 -r8
00000074 +rbp+10 +rcx
0000007a -rcx
000000b1 +rcx
000000c7 -rcx

VSadov avatar Mar 24 '25 22:03 VSadov

@mikem8361 to take a closer look from the SOS side

@VSadov - SOS needs to work with both new and old versions of .NET and it wasn't clear to me if the current PR handles that? I noticed over on https://github.com/dotnet/diagnostics/issues/4795 that discussion around that was already happening so great 👍

noahfalk avatar Mar 24 '25 23:03 noahfalk

and now the new SPS.dll can also decode gcinfo in a net8.0 process:

The same System.Threading.ManualResetEventSlim.Wait on net8.0 runtime produces:

0:000> !gcinfo 00007ffeb445893e 
entry point 00007FFEB4458780
preJIT generated code
GC info 00007FFEB417C778
Pointer table:
Prolog size: 0
GS cookie: <none>
PSPSym: initial.sp+30
Generics inst context: <none>
PSP slot: caller.sp+30
GenericInst slot: <none>
Varargs: 0
Frame pointer: rbp
Wants Report Only Leaf: 1
Size of parameter area: 30
Return Kind: Scalar                            <-- reports Return Kind for old format that tracks it.
Code size: 460
Untracked: +rbp+20 +rbp-48
00000035 interruptible
00000035 +rbp+10 +rcx
0000004d +rax
00000057 -rax
00000069 -rcx
000000a4 +rcx
000000be -rcx
000000c2 +rcx
000000c8 -rcx
000000ce +rax(interior)
. . .

VSadov avatar Mar 25 '25 06:03 VSadov

Looks like an unrelated failure: System.Net.Sockets.SocketException : Connection refused

VSadov avatar Mar 25 '25 16:03 VSadov

I think the change has all that was planned. Please take a look. CC: @jkotas

VSadov avatar Mar 25 '25 16:03 VSadov

This code can tell you if it is .NET Framework:

if (g_pRuntime->GetRuntimeConfiguration() == IRuntime::WindowsDesktop)

That does not want to work at the level of GC Info Decoder. I noticed that all the calls to this in at the strike.cpp level. I think we can check for .NET FX for the purpose of GC Info version at strike.cpp level as well.

We do not need .NET FX detection in the Core repo anyways. As long as decoder specialcases V1 and V1 is set somehow for .Net FX, this should work as before.

VSadov avatar Mar 27 '25 02:03 VSadov

I am not sure how to test .NET FX support in SOS though.

The same thing as I try for Core (i.e. load SOS.dll in WinDbg and examine some methods) does not work for me. I am getting errors when trying to load.

0:000> .load E:\diagnostics\diagnostics\artifacts\bin\Windows_NT.x64.Release\sos.dll
The call to LoadLibrary(E:\diagnostics\diagnostics\artifacts\bin\Windows_NT.x64.Release\sos.dll) failed, Win32 error 0n193
    "%1 is not a valid Win32 application."
Please check your debugger configuration and/or network access.
Extension DLL search Path:
    C:\Program Files\WindowsApps\Microsoft.WinDbg_1.2502.25002.0_x64__8wekyb3d8bbwe\x86\WINXP;C:\Program Files\WindowsApps\Microsoft.WinDbg_1.2502.25002.0_x64__8wekyb3d8bbwe\x86\winext;C:\Program Files\WindowsApps\Microsoft.WinDbg_1.2502.25002.0_x64__8wekyb3d8bbwe\x86\winext\arcade;C:\Program Files\WindowsApps\Microsoft.WinDbg_1.2502.25002.0_x64__8wekyb3d8bbwe\x86\pri;C:\Program 

. . . lots of directories in search path . . .

You may also consider deploying your extension to the UserExtensions extension gallery repository
located at %LOCALAPPDATA%\dbg\UserExtensions folder. It would require an extension manifest.
Error: Failed to load extension E:\diagnostics\diagnostics\artifacts\bin\Windows_NT.x64.Release\sos.dll

Same error happens with the SOS.dll from C:\Users\<user>\.dotnet\sos

I do not think this is related to the change. Maybe I am using this incorrectly... (I am not a frequent user of SOS)

VSadov avatar Mar 27 '25 02:03 VSadov

Figured why I could not test. .Net FX defaults to 32bit and thus I had a bitness mismatch.

Also I found that the change that supposedly "fixes" !gcinfo on .Net FX actually breaks it.

The reason is that the switch from GC Info v1 --> GC info v2 seems like happened in .Net FX times.

The NDP sources that I have (which are by themselves very old) have (NDP\clr\src\inc\gcinfo.h)

#define GCINFO_VERSION 2

. . . 

    static UINT32 ReadyToRunVersionToGcInfoVersion(UINT32 readyToRunMajorVersion)
    {
        // GcInfo version is 1 up to ReadyTorun version 1.x
        // GcInfo version is current from  ReadyToRun version 2.0
        return (readyToRunMajorVersion == 1) ? 1 : GCINFO_VERSION;
    }

Since R2R is a CoreClr thing, does this mean that GCInfo v1 is effectively dead - either on .Net FX or on CoreCLR ?

VSadov avatar Mar 27 '25 06:03 VSadov

GCInfo v1 is effectively dead - either on .Net FX or on CoreCLR ?

Sounds like it.

jkotas avatar Mar 27 '25 06:03 jkotas

I will revert to the commit that does not handle GC_INFO_FLAGS_BIT_SIZE_VERSION_1

VSadov avatar Mar 27 '25 06:03 VSadov

@mikem8361 The runtime part of this change is ready to go. Please take one more look from diagnostics side.

VSadov avatar Mar 27 '25 19:03 VSadov

Thanks!!

VSadov avatar Apr 04 '25 02:04 VSadov

Looks like I do not have permissions to merge in this repo, so someone else should merge.

VSadov avatar Apr 04 '25 02:04 VSadov