
Roslyn analyzer throws error AD0001 NullReferenceException

Open akoeplinger opened this issue 1 year ago • 24 comments

Build

https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=733128

Build leg reported

VMR Vertical Build / Ubuntu2404_DevVersions_x64 / Build

Pull Request

https://github.com/dotnet/sdk/pull/42019

Known issue core information

Fill out the known issue JSON section by following the step by step documentation on how to create a known issue

 {
    "ErrorMessage" : "",
    "BuildRetry": true,
    "ErrorPattern": "error AD0001: Analyzer.*threw an exception of type 'System.NullReferenceException'",
    "ExcludeConsoleLog": false
 }
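For illustration, here is a minimal sketch of how the `ErrorPattern` above behaves as a regular expression. It assumes the pattern is applied to each log line as an unanchored .NET regex, which is an assumption about Build Analysis rather than something stated in this issue. The class and method names are hypothetical; the sample lines are taken from comments in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KnownIssuePatternCheck
{
    // ErrorPattern copied verbatim from the known issue JSON.
    public const string ErrorPattern =
        "error AD0001: Analyzer.*threw an exception of type 'System.NullReferenceException'";

    // Assumption: the pattern is searched for anywhere in a single log line.
    public static bool Matches(string logLine) => Regex.IsMatch(logLine, ErrorPattern);

    public static void Main()
    {
        // NullReferenceException variant: expected to match.
        string nre = "CSC : error AD0001: Analyzer 'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' threw an exception of type 'System.NullReferenceException' with message 'Object reference not set to an instance of an object.'.";

        // InvalidProgramException variant: not covered by this pattern.
        string ipe = "Analyzer 'Microsoft.Interop.Analyzers.CustomMarshallerAttributeAnalyzer' threw an exception of type 'System.InvalidProgramException'";

        Console.WriteLine(Matches(nre));
        Console.WriteLine(Matches(ipe));
    }
}
```

Because the pattern names `System.NullReferenceException` explicitly, other exception types thrown by analyzers would need their own pattern to be tracked.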

@dotnet/dnceng

Release Note Category

  • [ ] Feature changes/additions
  • [ ] Bug fixes
  • [ ] Internal Infrastructure Improvements

Release Note Description

Additional information about the issue reported

No response

Known issue validation

Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=733128
Error message validated: [error AD0001: Analyzer.*threw an exception of type 'System.NullReferenceException']
Result validation: :white_check_mark: Known issue matched with the provided build.
Validation performed at: 7/8/2024 7:22:28 PM UTC

Report

Build | Definition | Step Name | Console log | Pull Request
1036955 | dotnet/runtime | Build product | Log | dotnet/runtime#115395
1036957 | dotnet/runtime | Build product | Log | dotnet/runtime#115395
1036913 | dotnet/runtime | Build product | Log | dotnet/runtime#115395
1036846 | dotnet/runtime | Build product | Log | dotnet/runtime#115395
1036847 | dotnet/runtime | Build product | Log | dotnet/runtime#115395

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
0 | 0 | 5

akoeplinger avatar Jul 08 '24 19:07 akoeplinger

Hmm... This error showed up in https://github.com/dotnet/sdk/pull/36807 as well:

CSC : error AD0001: Analyzer 'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' threw an exception of type 'System.NullReferenceException' with message 'Object reference not set to an instance of an object.'. [/vmr/src/efcore/src/EFCore/EFCore.csproj]

But I don't see it in the above table. Are we missing data?

ViktorHofer avatar Jul 08 '24 20:07 ViktorHofer

It's now in the table, probably just took a bit.

akoeplinger avatar Jul 09 '24 07:07 akoeplinger

Interesting, I was under the impression that this only affects VMR builds but apparently this is more widespread.

ViktorHofer avatar Jul 09 '24 09:07 ViktorHofer

Yeah. I queried Kusto and we hit this 133 times over the last 60 days just in build logs.

akoeplinger avatar Jul 09 '24 09:07 akoeplinger

Should we add "BuildRetry": true to the known issue pattern here? I haven't used that feature but it may help.

ericstj avatar Jul 10 '24 17:07 ericstj

Yeah. Done.

akoeplinger avatar Jul 10 '24 18:07 akoeplinger

Not sure what is going on but a significant chunk of the hits here seem to be false positives. Just spent 10 minutes digging through builds and can't see the error on 75% of them.

jaredpar avatar Jul 19 '24 19:07 jaredpar

Many of the builds are attributed to dotnet-runtime when it's actually dotnet-runtime-perf. Also the results look like this ...

image

If the compiler is indeed throwing there it's very hard to dig through to the failure.

jaredpar avatar Jul 19 '24 19:07 jaredpar

Cross posting the analysis from the linked bug on roslyn-analyzers

This is the stack of the NullReferenceException in at least one case:

System.NullReferenceException: Object reference not set to an instance of an object.
at System.Collections.Concurrent.ConcurrentDictionary`2.TryRemoveInternal(TKey key, TValue& value, Boolean matchValue, TValue oldValue)
at Microsoft.CodeQuality.Analyzers.Maintainability.AvoidUnusedPrivateFieldsAnalyzer.<>c__DisplayClass5_0.<Initialize>b__2(OperationAnalysisContext operationContext)
at Microsoft.CodeAnalysis.Diagnostics.AnalyzerExecutor.ExecuteAndCatchIfThrows_NoLock[TArg](DiagnosticAnalyzer analyzer, Action`1 analyze, TArg argument, Nullable`1 info, CancellationToken cancellationToken)

That almost certainly represents this line in the roslyn analyzers code:

IFieldSymbol field = ((IFieldReferenceOperation)operationContext.Operation).Field;
if (field.DeclaredAccessibility == Accessibility.Private)
{
    referencedPrivateFields.TryAdd(field, default);
    // Error is here. 
    maybeUnreferencedPrivateFields.TryRemove(field, out _);
}

Both values here are non-null:

  • maybeUnreferencedPrivateFields: is single assign and initialized to non-null at declaration
  • field: is used above this line several times without null reffing.

That seems like a runtime bug.
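To illustrate why this stack is so surprising, here is a minimal standalone sketch of the same `TryAdd` / `TryRemove` sequence. It is not the analyzer's code: `string` keys stand in for `IFieldSymbol`, the `byte` value type and the class and method names are hypothetical, and only the two dictionary names are taken from the snippet above.

```csharp
using System;
using System.Collections.Concurrent;

public static class TryRemoveSketch
{
    // Hypothetical stand-ins for the analyzer's dictionaries (string replaces IFieldSymbol).
    private static readonly ConcurrentDictionary<string, byte> referencedPrivateFields =
        new ConcurrentDictionary<string, byte>();
    private static readonly ConcurrentDictionary<string, byte> maybeUnreferencedPrivateFields =
        new ConcurrentDictionary<string, byte>();

    // Mirrors the two calls in the analyzer snippet; returns whether a key was removed.
    public static bool MarkReferenced(string field)
    {
        referencedPrivateFields.TryAdd(field, default);
        return maybeUnreferencedPrivateFields.TryRemove(field, out _);
    }

    public static void Main()
    {
        maybeUnreferencedPrivateFields.TryAdd("_count", default);

        Console.WriteLine(MarkReferenced("_count"));   // key present: removed
        Console.WriteLine(MarkReferenced("_missing")); // key absent: nothing to remove

        try
        {
            MarkReferenced(null);
        }
        catch (ArgumentNullException)
        {
            // A null key is rejected with ArgumentNullException, not NullReferenceException.
            Console.WriteLine("null key: ArgumentNullException");
        }
    }
}
```

With a non-null dictionary and a non-null key, the documented outcomes are a `true` or `false` return value, and even a null key surfaces as `ArgumentNullException`. A `NullReferenceException` from inside `TryRemoveInternal` does not fit any of those paths.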

jaredpar avatar Jul 19 '24 20:07 jaredpar

@akoeplinger, @ericstj, @ViktorHofer at least the variation I'm seeing above appears to be a runtime bug. Do we want to use this issue to track that or file a new one?

jaredpar avatar Jul 22 '24 17:07 jaredpar

I think a new one since the pattern observed here (single log statement with NRE) can't necessarily tie it to the single analyzer. Do you have a dump to help triage the runtime issue or is it just based on the callstack?

ericstj avatar Jul 22 '24 17:07 ericstj

can't necessarily tie it to the single analyzer.

The log statements are always single line (for reasons I don't understand). But if you get the matching binlog you can usually see the full stack trace if you dig down into the messages.

Do you have a dump to help triage the runtime issue or is it just based on the callstack?

This is just based on call stacks. It manifests as an exception in the analyzer, and by default the compiler catches those and issues a warning.

The compiler can be configured to fail fast when this happens by setting the following MSBuild property:

<Features>$(Features);debug-analyzers</Features>

The failures mostly occur on the runtime pipeline builds, so you'd need to be set up to catch crash dumps on process FailFast. Do that and we should get a dump within a few days.

jaredpar avatar Jul 22 '24 19:07 jaredpar

Another variation of the NRE looks like this

Exception occurred with following context:
Compilation: Microsoft.CodeAnalysis.Razor.Compiler
IOperation: Invocation
SyntaxTree: /vmr/src/razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/Syntax/Generated/Syntax.xml.Syntax.Generated.cs
SyntaxNode: GetAnnotations() [InvocationExpressionSyntax]@[6078..6094) (208,30)-(208,46)
System.NullReferenceException: Object reference not set to an instance of an object.
at Microsoft.NetCore.Analyzers.Runtime.DetectPreviewFeatureAnalyzer.GetOperationSymbol(IOperation operation)
at Microsoft.NetCore.Analyzers.Runtime.DetectPreviewFeatureAnalyzer.OperationUsesPreviewFeatures(OperationAnalysisContext context, ConcurrentDictionary`2 requiresPreviewFeaturesSymbols, INamedTypeSymbol previewFeatureAttributeSymbol, ISymbol& referencedPreviewSymbol)
at Microsoft.NetCore.Analyzers.Runtime.DetectPreviewFeatureAnalyzer.<>c__DisplayClass33_0.<Initialize>b__1(OperationAnalysisContext context)
at Microsoft.CodeAnalysis.Diagnostics.AnalyzerExecutor.ExecuteAndCatchIfThrows_NoLock[TArg](DiagnosticAnalyzer analyzer, Action`1 analyze, TArg argument, Nullable`1 info, CancellationToken cancellationToken)

That is basically down to this block of code. That code on its own (no inlining) is very hard to see an NRE on. I suspect that there is some amount of inlining going on here.

The invocation being analyzed here is the GetAnnotations() call on this line. That means we should be at this point in the code

        private static ISymbol? GetOperationSymbol(IOperation operation)
            => operation switch
            {
                // EXECUTION SHOULD BE HERE
                IInvocationOperation iOperation => iOperation.TargetMethod,
                IObjectCreationOperation cOperation => cOperation.Constructor,

Basically that should be an InvocationOperation and given that it's sealed and only impl of IInvocationOperation it is likely a candidate for inlining. At the same time the TargetMethod is an auto-implemented property and shouldn't ever null ref itself.
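As a rough illustration of why that arm looks null-safe, here is a minimal sketch of the same switch-expression shape. The types below are hypothetical stand-ins, not the Roslyn `IOperation` types; only the shape of the type-pattern arms is taken from the snippet above.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for IOperation, InvocationOperation, and object creation.
public interface IOperationLike { }

public sealed class InvocationLike : IOperationLike
{
    // Auto-implemented property, like TargetMethod in the discussion above.
    public string TargetMethod { get; } = "GetAnnotations";
}

public sealed class ObjectCreationLike : IOperationLike
{
    public string? Constructor { get; } = null;
}

public static class SwitchSketch
{
    public static string? GetSymbolName(IOperationLike? operation)
        => operation switch
        {
            // A type pattern only matches a non-null instance, so 'invocation' is never null here.
            InvocationLike invocation => invocation.TargetMethod,
            ObjectCreationLike creation => creation.Constructor,
            // A null operand, or any other type, falls through to the discard arm.
            _ => null,
        };

    public static void Main()
    {
        Console.WriteLine(GetSymbolName(new InvocationLike()));
        Console.WriteLine(GetSymbolName(new ObjectCreationLike()) is null);
        Console.WriteLine(GetSymbolName(null) is null);
    }
}
```

In this shape, the binding in each arm is non-null by construction and reading an auto-property cannot dereference null, which matches the reasoning that the managed code alone should not produce this exception.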

This is a very puzzling one to understand. Seems like another case where we'd need the debug-analyzers flag to get a dump and track it down.

@333fred in case he can see any flaw in my analysis here.

jaredpar avatar Jul 22 '24 22:07 jaredpar

That is basically down to this block of code. That code on its own (no inlining) is very hard to see an NRE on.

Agreed. Looking at that block, there shouldn't be an opportunity for a null ref there, everything is null-safe. Even inlining in that location shouldn't result in a null-ref; looking through every property called by that method, directly or indirectly, they're auto-props, and the instances they're called on are checked for null beforehand. The only thing in that closure that isn't an extremely simple auto-prop is the call to arrayTypeSymbol.ElementType, but it's also extremely hard to see where that property would ever null-ref; further, it's not a good candidate for inlining, since there are several implementations. Agreed that we need more data to troubleshoot further.

333fred avatar Jul 22 '24 23:07 333fred

Note: the results from the aspnetcore-quarantined-pr pipeline aren't actionable. They don't upload any binlogs when the build fails so we can't see what is happening.

@captainsafia who on aspnetcore owns this pipeline?

jaredpar avatar Jul 23 '24 16:07 jaredpar

@captainsafia who on aspnetcore owns this pipeline?

AFAIK, the build isn't configured to produce binlogs at the moment (ref). @wtgodbe can help with making a change here to produce binlogs for further investigation.

captainsafia avatar Jul 23 '24 16:07 captainsafia

Started compiling all of the results from looking at the failures into this gist

jaredpar avatar Jul 23 '24 17:07 jaredpar

AFAIK, the build isn't configured to produce binlogs at the moment (ref). @wtgodbe can help with making a change here to produce binlogs for further investigation.

Note: the aspnetcore-ci pipeline uploads logs but it overwrites them on retry. That means if we hit any of these failures, then retry the logs of the failure are effectively deleted. That means we can't really get any info from any of the aspnetcore pipelines here.

jaredpar avatar Jul 23 '24 17:07 jaredpar

I have a gist where I've summarized the different errors I'm seeing. I dug into five of them.

  • Roughly three are NREs where it's very hard to see how there could be an NRE.
  • One I can only narrow down to a medium-sized method in roslyn-analyzers. That method is hardened against a lot of nulls, but given its size it's harder to say with confidence what is happening there.
  • One is a cast exception in the compiler. In isolation that feels like a compiler bug. In context I wonder if it could be related to the underlying issue we're seeing.

There are 2-3 other analyzers producing AD0001 diagnostics that I haven't bothered to dig into.

I think the next steps are to get the owners of dotnet-unified-build and sdk-unified-build to enable compiler crash on analyzer exception and collection of dump logs so we can get a better idea what is going on here.

jaredpar avatar Jul 23 '24 17:07 jaredpar

Can we please resolve this as a dupe of https://github.com/dotnet/runtime/issues/104123? That is what the runtime team's investigation led them to. The fix is in PR and will be backported to 9.0 P7.

jaredpar avatar Jul 26 '24 16:07 jaredpar

the issue just got reopened, I also think we should keep this open for a bit so we have Build Analysis tracking

akoeplinger avatar Jul 26 '24 20:07 akoeplinger

I am also running into this (and some variants) when building the .NET 9 Preview 7 using the VMR on a number of arm64 platforms using 9.0.100-preview.7.24380.2 as the build SDK.

Like I mentioned in https://github.com/dotnet/source-build/issues/4555 I am seeing a number of variants of errors:

  • Analyzer 'Microsoft.Interop.Analyzers.CustomMarshallerAttributeAnalyzer' threw an exception of type 'System.InvalidProgramException' with message 'Common Language Runtime detected an invalid program
  • CSC : error AD0001: Analyzer 'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' threw an exception of type 'System.NullReferenceException' with message 'Object reference not set to an instance of an object.'
  • error MSB6006: "csc.dll" exited with code 139
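For triage across variants like these, a small sketch of pulling the analyzer name and exception type out of a single-line AD0001 message may be useful. The regex and the `Ad0001Parser` name are hypothetical; the message shape is assumed from the lines quoted in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Ad0001Parser
{
    // Assumed message shape: Analyzer '<name>' threw an exception of type '<type>'
    private static readonly Regex Pattern = new Regex(
        @"Analyzer '(?<analyzer>[^']+)' threw an exception of type '(?<exception>[^']+)'");

    // Returns null when the line does not look like an analyzer exception message.
    public static (string Analyzer, string Exception)? Parse(string line)
    {
        Match match = Pattern.Match(line);
        if (!match.Success)
        {
            return null;
        }
        return (match.Groups["analyzer"].Value, match.Groups["exception"].Value);
    }

    public static void Main()
    {
        var parsed = Parse("CSC : error AD0001: Analyzer 'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' threw an exception of type 'System.NullReferenceException' with message 'Object reference not set to an instance of an object.'");
        if (parsed.HasValue)
        {
            Console.WriteLine(parsed.Value.Analyzer);
            Console.WriteLine(parsed.Value.Exception);
        }
    }
}
```

Grouping hits by the (analyzer, exception type) pair would separate the `NullReferenceException` and `InvalidProgramException` variants. Lines such as the MSB6006 exit-code failure carry no analyzer name and return null here.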

omajid avatar Aug 14 '24 16:08 omajid

This is a dupe of dotnet/runtime#104123. I don't have permissions to resolve the issue.

jaredpar avatar Aug 15 '24 05:08 jaredpar

https://github.com/dotnet/source-build/issues/4576 - I suspect that SB just encountered this error in one of our 9.0 builds

ellahathaway avatar Aug 23 '24 16:08 ellahathaway

Build Analysis shows no more hits in the last 30 days, closing

akoeplinger avatar Aug 21 '25 16:08 akoeplinger