AF: *(_UNCHECKED_OBJECTREF *)handle == NULL (HndCreateHandle called by getJitHandleForObject)

Open SakeTao opened this issue 6 months ago • 8 comments

Failed in: runtime-coreclr gcstress-extra 20250629.1

Failed tests:

coreclr osx arm64 Checked gcstress0xc_disabler2r @ OSX.13.Arm64.Open
    - JIT/Regression/JitBlue/DevDiv_461649/DevDiv_461649/DevDiv_461649.cmd

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1081377&view=results

Build error leg or test failing:

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": "*(_UNCHECKED_OBJECTREF *)handle == NULL",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: :mag_right: Result validation: :warning: Validation could not be done without an Azure DevOps build URL on the issue. Please add it to the "Build: :mag_right:" line. Validation performed at: 9/28/2025 6:05:32 PM UTC

Report

| Build | Definition | Test | Pull Request |
|---|---|---|---|
| 1224397 | dotnet/runtime | ComInterfaceGenerator.Unit.Tests.WorkItemExecution | dotnet/runtime#121853 |
| 1222799 | dotnet/runtime | Methodical_d1.WorkItemExecution | |
| 1221587 | dotnet/runtime | System.Runtime.Numerics.Tests.WorkItemExecution | dotnet/runtime#120330 |
| 1214506 | dotnet/runtime | System.Runtime.Numerics.Tests.WorkItemExecution | dotnet/runtime#122025 |
| 1214273 | dotnet/runtime | System.Runtime.Numerics.Tests.WorkItemExecution | dotnet/runtime#121986 |
| 1214171 | dotnet/runtime | System.Private.Xml.Tests.WorkItemExecution | dotnet/runtime#120866 |
| 1213242 | dotnet/runtime | System.Runtime.Numerics.Tests.WorkItemExecution | dotnet/runtime#121956 |
| 1213192 | dotnet/runtime | System.Collections.Tests.WorkItemExecution | dotnet/runtime#121976 |
| 1209160 | dotnet/runtime | System.Threading.Tasks.Parallel.Tests.WorkItemExecution | |
| 1204618 | dotnet/runtime | System.Runtime.Numerics.Tests.WorkItemExecution | dotnet/runtime#121679 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 1 | 3 | 10 |

SakeTao, Jun 30 '25 02:06

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

This hasn't hit in the last few runs. Removing blocking-optional tag.

amanasifkhalid, Jul 28 '25 15:07

This recently hit on win-x64 in our innerloop tests, though it's not widespread. I'm unable to repro it locally, possibly due to xunit scrambling test execution order. @EgorBo based on what I see in the crash dump, your hypothesis that this assert is hitting in the getStaticFieldContent lookup is correct. I think the static field is an empty array; the dump is for a Checked JIT and thus isn't too enlightening, but I can see its type name is System.Type.EmptyTypes. Could you PTAL?

amanasifkhalid, Aug 11 '25 21:08

I'm not able to reproduce this locally and am not sure it's a JIT issue. The stack trace looks like the following:


Assert failure(PID 7236 [0x00001c44], Thread: 9044 [0x2354]): *(_UNCHECKED_OBJECTREF *)handle == NULL

CORECLR! HndCreateHandle + 0x205 (0x00007ffb`cc86dfe5)
CORECLR! GCHandleStore::CreateHandleOfType + 0xB4 (0x00007ffb`cc908524)
CORECLR! CEEInfo::getJitHandleForObject + 0x1CF (0x00007ffb`cc39cb8f)
CORECLR! CEEInfo::getStaticObjRefContent + 0x189 (0x00007ffb`cc3a3cb9)
CORECLR! CEEInfo::getStaticFieldContent + 0x2E6 (0x00007ffb`cc3a3436)
CLRJIT! ValueNumStore::VNForFunc + 0x24C (0x00007ffb`e89d992c)
CLRJIT! ValueNumStore::VNPairForFunc + 0x3E (0x00007ffb`e89ddcce)
CLRJIT! Compiler::fgValueNumberTree + 0xBF4 (0x00007ffb`e8a2a4b4)
CLRJIT! Compiler::fgValueNumberBlock + 0x752 (0x00007ffb`e8a23d92)
CLRJIT! Compiler::fgValueNumberBlocks + 0x257 (0x00007ffb`e8a241b7)
    File: D:\a\_work\1\s\src\coreclr\gc\handletable.cpp:306
    Image: C:\h\w\C5E90ACD\p\dotnet.exe

So the JIT compiles a method, sees a static readonly X[] array field, and needs to know its length.

  1. It asks the VM, via the JIT API getStaticFieldContent, for the field's content (with ignoreMovableObjects=false, meaning it's fine to return a handle to a movable object, since we only need it during JIT time to call getArrayOrStringLength on it).
  2. getStaticFieldContent calls getJitHandleForObject for our movable object.
  3. The object doesn't belong to the NonGC heap, so we have to create a handle for it via AppDomain::GetCurrentDomain()->CreateHandle and add that handle to m_pJitHandles (the list of JIT-compilation-time handles).
  4. Once the JIT compilation is finished, we destroy all handles in m_pJitHandles.
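
The lifecycle in the steps above can be sketched with a toy model. This is only an illustration under stated assumptions, not the runtime's code: ToyHandleTable, ToyJitContext, and SimulateOneCompilation are hypothetical names, handles are plain slot indices, and the real handle table is considerably more involved. It shows handles being created on demand, recorded in a per-compilation list, and destroyed when the compilation ends.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GC handle table: each slot holds an object
// pointer, and a free slot is expected to contain nullptr.
class ToyHandleTable {
public:
    using Handle = std::size_t;

    Handle Create(const void* obj) {
        Handle h;
        if (!free_.empty()) {            // reuse a previously freed slot
            h = free_.back();
            free_.pop_back();
        } else {                         // otherwise grow the table
            h = slots_.size();
            slots_.push_back(nullptr);
        }
        assert(slots_[h] == nullptr);    // same shape as the failing assert
        slots_[h] = obj;
        return h;
    }

    void Destroy(Handle h) {
        slots_[h] = nullptr;
        free_.push_back(h);
    }

    std::size_t LiveCount() const {
        std::size_t n = 0;
        for (const void* p : slots_) {
            if (p != nullptr) ++n;
        }
        return n;
    }

private:
    std::vector<const void*> slots_;
    std::vector<Handle> free_;
};

// Hypothetical per-compilation context: every handle it hands out is
// recorded and destroyed exactly once, when the context goes away.
class ToyJitContext {
public:
    explicit ToyJitContext(ToyHandleTable& table) : table_(table) {}
    ToyJitContext(const ToyJitContext&) = delete;
    ToyJitContext& operator=(const ToyJitContext&) = delete;

    ~ToyJitContext() {
        for (ToyHandleTable::Handle h : jitHandles_) table_.Destroy(h);
    }

    ToyHandleTable::Handle GetJitHandleForObject(const void* obj) {
        ToyHandleTable::Handle h = table_.Create(obj);
        jitHandles_.push_back(h);        // remember it for cleanup
        return h;
    }

private:
    ToyHandleTable& table_;
    std::vector<ToyHandleTable::Handle> jitHandles_;
};

// Simulates one compilation and returns the number of handles still live
// after it has finished.
inline std::size_t SimulateOneCompilation(ToyHandleTable& table) {
    static const int fakeArray[300] = {};  // stands in for the static array
    {
        ToyJitContext jit(table);
        jit.GetJitHandleForObject(fakeArray);
        assert(table.LiveCount() == 1);    // handle lives during the "JIT"
    }
    return table.LiveCount();
}
```

In this model, repeated compilations reuse the same slot without tripping the assert, because each handle is destroyed exactly once.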

Now the question is: why does CreateHandle fail inside HndCreateHandle with "the content of the newly allocated handle is not null"? Did someone forget to destroy a handle, or is it a race condition?

EgorBo, Sep 28 '25 12:09

The most likely cause of this assert is a GC handle double-free. The only other hit in this repo is https://github.com/dotnet/runtime/pull/55596#issuecomment-880137400, which was a GC handle double-free bug.

It is interesting that all hits that I have seen have the JIT/EE interface on the stack, across multiple different tests. If we had a double-free bug at some random place in the runtime or libraries, I would expect the stacks to vary.
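
To make the double-free hypothesis concrete, here is a minimal sketch of how freeing a handle twice could surface later as "the newly allocated handle is not null". It assumes a simple free-list allocator with no double-free protection; ToySlotTable, TryCreate, and DoubleFreeTripsLaterCreate are hypothetical names and do not reflect how the actual handle table is implemented.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical free-list slot table. TryCreate returns false at the point
// where a checked build would assert that the new slot is empty.
class ToySlotTable {
public:
    using Handle = std::size_t;

    bool TryCreate(const void* obj, Handle& out) {
        Handle h;
        if (!free_.empty()) {
            h = free_.back();
            free_.pop_back();
        } else {
            h = slots_.size();
            slots_.push_back(nullptr);
        }
        out = h;
        if (slots_[h] != nullptr) {
            return false;                // slot is still in use by someone
        }
        slots_[h] = obj;
        return true;
    }

    // Deliberately has no double-free detection (assumption of this model).
    void Destroy(Handle h) {
        slots_[h] = nullptr;
        free_.push_back(h);
    }

private:
    std::vector<const void*> slots_;
    std::vector<Handle> free_;
};

// Returns true if a double free makes two later allocations receive the
// same slot, with the second one finding it already occupied.
inline bool DoubleFreeTripsLaterCreate() {
    static const int a = 0, b = 0, c = 0;
    ToySlotTable table;
    ToySlotTable::Handle h0 = 0, h1 = 0, h2 = 0;

    table.TryCreate(&a, h0);
    table.Destroy(h0);
    table.Destroy(h0);                           // bug: slot queued twice

    bool first = table.TryCreate(&b, h1);        // gets the slot, stores b
    bool second = table.TryCreate(&c, h2);       // gets the same slot again

    return first && !second && h1 == h2;
}
```

In this model the failure shows up at whichever caller allocates next, not at the code that freed the handle twice, and the slot's stale content is the object stored by the other, legitimate allocation.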

jkotas, Sep 28 '25 18:09

> The most likely cause of this assert is a GC handle double-free. The only other hit in this repo is #55596 (comment), which was a GC handle double-free bug.
>
> It is interesting that all hits that I have seen have the JIT/EE interface on the stack, across multiple different tests. If we had a double-free bug at some random place in the runtime or libraries, I would expect the stacks to vary.

From my understanding it shouldn't be caused by that JIT usage as we carefully create handles only here, register them in a list and then destroy them in CEEInfo's destructor here 🤔

EgorBo, Sep 28 '25 21:09

@jkotas in case it rings a bell, here is additional info from the memory dump (this):

Full stack trace: https://gist.github.com/EgorBo/c3214d6bdb9fcb4f409884b3a5402251

I guess the test where it fails is System_Memory_Tests!System.SpanTests.ReadOnlySpanTests.TestMultipleMatchLastIndexOfAny_String_TwoByte, and OSR is involved (if that matters). I can confirm that the JIT just tries to obtain a temporary CORINFO_OBJECT_HANDLE for the private static readonly string?[] s_smallNumberCache = new string[SmallNumberCacheLength]; // SmallNumberCacheLength=300 field, only to then obtain its length.

The object that the newly allocated handle still points to (instead of being null) seems to be:

0:008> !do 0x0000019ff8411768
Name:        System.String[]
MethodTable: 00007ffb6d45ec78
Canonical MethodTable: 00007ffb6ccff108
Tracked Type: false
Size:        2424(0x978) bytes
Array:       Rank 1, Number of elements 300, Type CLASS (Print Array)
Fields:
None

Interestingly, the object that was abandoned also contains 300 elements 🤔

EgorBo, Sep 29 '25 04:09

In the dumps that I have looked at, I have often seen s_smallNumberCache involved as either the handle value or the object being wrapped, but it has not been a uniform pattern. I have seen other objects as well. I think s_smallNumberCache tends to show up because integer formatting is frequently inlined in tiered compiled code.

jkotas, Sep 29 '25 04:09