NullReferenceException in OutsideRuntimeClient
We've got an Orleans Silo running in Azure in a Linux container app, with a client Azure function app that's pulling from a queue to process data. This runs fine if messages are processed one at a time, but when multiple messages are processed at the same time and the function app is trying to send messages to the same grain concurrently, we get the following exception:
Exception while executing function: Functions.IngestPricesTrigger Result: Failure
Exception: System.AggregateException: One or more errors occurred. (Object reference not set to an instance of an object.)
---> System.NullReferenceException: Object reference not set to an instance of an object.
at Orleans.OutsideRuntimeClient.SendRequestMessage(GrainReference target, Message message, IResponseCompletionSource context, InvokeMethodOptions options) in //src/Orleans.Core/Runtime/OutsideRuntimeClient.cs:line 244
at Orleans.OutsideRuntimeClient.SendRequest(GrainReference target, IInvokable request, IResponseCompletionSource context, InvokeMethodOptions options) in //src/Orleans.Core/Runtime/OutsideRuntimeClient.cs:line 236
at Orleans.Runtime.GrainReferenceRuntime.InvokeMethodAsync[TResult](GrainReference reference, IInvokable request, InvokeMethodOptions options) in //src/Orleans.Core/Runtime/GrainReferenceRuntime.cs:line 45
at Orleans.Runtime.GrainReference.InvokeAsync[T](IInvokable methodDescription) in //src/Orleans.Core.Abstractions/Runtime/GrainReference.cs:line 413
at OrleansCodeGen.Engine.GrainInterfaces.Proxy_IPriceSourceValidationGrain.global::Engine.GrainInterfaces.IPriceSourceValidationGrain.GetPriceSourceValidationRules() in C:\agents\03_work\4\s\Engine.GrainInterfaces\Orleans.CodeGenerator\Orleans.CodeGenerator.OrleansSerializationSourceGenerator\Engine.GrainInterfaces.orleans.g.cs:line 415
at Engine.Data.Orleans.Repositories.OrleansPriceSourceRepository.GetPriceSourceValidationRules(Int32 priceSourceId) in C:\agents\03_work\4\s\Engine.Data.Orleans\Repositories\OrleansPriceSourceRepository.cs:line 44
at Engine.Logic.Implementations.PriceValidationService.ValidateUnprocessedPrices(UnprocessedPrice[] prices, Int32 priceSourceId) in C:\agents\03_work\4\s\Engine.Logic\Implementations\PriceValidationService.cs:line 28
at Engine.Logic.Implementations.PricingService.ProcessBatchPrices(UnprocessedPrice[] unprocessedPrices, Int32 priceSourceId, DateTime receviedDate, Nullable`1 reportedDate, String fileName, Nullable`1 userId, Guid processGuid) in C:\agents\03_work\4\s\Engine.Logic\Implementations\PricingService.cs:line 40
at Engine.Functions.Triggers.IngestPricesTrigger.Run(IngestPriceBatchCommand payload) in C:\agents\03_work\4\s\Engine.Functions\Triggers\IngestPricesTrigger.cs:line 26
at Microsoft.Azure.Functions.Worker.Invocation.VoidTaskMethodInvoker`2.InvokeAsync(TReflected instance, Object[] arguments) in D:\a\_work\1\s\src\DotNetWorker.Core\Invocation\VoidTaskMethodInvoker.cs:line 22
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at Microsoft.Azure.Functions.Worker.Invocation.DefaultFunctionInvoker`2.<>c.<InvokeAsync>b__6_0(Task`1 t) in D:\a\_work\1\s\src\DotNetWorker.Core\Invocation\DefaultFunctionInvoker.cs:line 32
at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
at System.Threading.Tasks.Task.<>c.<.cctor>b__273_0(Object obj)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
at Microsoft.Azure.Functions.Worker.Invocation.DefaultFunctionExecutor.ExecuteAsync(FunctionContext context) in D:\a\_work\1\s\src\DotNetWorker.Core\Invocation\DefaultFunctionExecutor.cs:line 44
at Microsoft.Azure.Functions.Worker.OutputBindings.OutputBindingsMiddleware.Invoke(FunctionContext context, FunctionExecutionDelegate next) in D:\a\_work\1\s\src\DotNetWorker.Core\OutputBindings\OutputBindingsMiddleware.cs:line 13
at Microsoft.Azure.Functions.Worker.GrpcWorker.InvocationRequestHandlerAsync(InvocationRequest request, IFunctionsApplication application, IInvocationFeaturesFactory invocationFeaturesFactory, ObjectSerializer serializer, IOutputBindingsInfoProvider outputBindingsInfoProvider, IInputConversionFeatureProvider functionInputConversionFeatureProvider) in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 199
Stack: at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at Microsoft.Azure.Functions.Worker.Invocation.DefaultFunctionInvoker`2.<>c.<InvokeAsync>b__6_0(Task`1 t) in D:\a\_work\1\s\src\DotNetWorker.Core\Invocation\DefaultFunctionInvoker.cs:line 32
at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
at System.Threading.Tasks.Task.<>c.<.cctor>b__273_0(Object obj)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
at Microsoft.Azure.Functions.Worker.Invocation.DefaultFunctionExecutor.ExecuteAsync(FunctionContext context) in D:\a_work\1\s\src\DotNetWorker.Core\Invocation\DefaultFunctionExecutor.cs:line 44
at Microsoft.Azure.Functions.Worker.OutputBindings.OutputBindingsMiddleware.Invoke(FunctionContext context, FunctionExecutionDelegate next) in D:\a_work\1\s\src\DotNetWorker.Core\OutputBindings\OutputBindingsMiddleware.cs:line 13
at Microsoft.Azure.Functions.Worker.GrpcWorker.InvocationRequestHandlerAsync(InvocationRequest request, IFunctionsApplication application, IInvocationFeaturesFactory invocationFeaturesFactory, ObjectSerializer serializer, IOutputBindingsInfoProvider outputBindingsInfoProvider, IInputConversionFeatureProvider functionInputConversionFeatureProvider) in D:\a_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 199
This is the code that's used to get the grain. PriceSourceId is never going to be null, so that's not the issue.
var priceSourceValidation = _orleansClusterClient.GetGrain<IPriceSourceValidationGrain>(priceSourceId);
return await priceSourceValidation.GetPriceSourceValidationRules();
We're using Azure table storage for the membership table.
How are you configuring the client? The client is thread safe, but there's a possibility that the client hasn't fully started before being accessed.
Thanks for the reply, this is the code we're using to configure the client:
.UseOrleansClient((context, builder) =>
{
    var siloConfig = context.Configuration.GetSection(nameof(SiloSettings)).Get<SiloSettings>();
    builder
        .UseAzureStorageClustering(opt =>
        {
            opt.ConfigureTableServiceClient(context.Configuration.GetConnectionString("EngineIngestionStorageAccount"));
        })
        .Configure<ClusterOptions>(o =>
        {
            o.ClusterId = siloConfig.ClusterId;
            o.ServiceId = siloConfig.ServiceId;
        });
})
We'll do some testing and see if this is something that's only occurring when the client is starting up.
We've done more testing and only managed to replicate the issue when the function app is started / restarted, so it looks like it is an issue with the startup.
I've tried a workaround that appears to be working: using a retry policy on the Azure function trigger and increasing the delay so that the Orleans client has enough time to start. It would be useful, though, if there were some way to ensure that the Azure Service Bus trigger is only executed once the Orleans client has started, so we didn't have to rely on retries.
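One possible way to avoid relying on retries is a small readiness gate. This is only a sketch under assumptions, not something confirmed in this thread: it assumes the Orleans client connects inside a hosted service that UseOrleansClient registers, and that the host starts hosted services one at a time in registration order. ClientReadyGate and ClientReadySignal are hypothetical names, not Orleans or Azure Functions APIs.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

// Hypothetical helper: a one-shot gate that callers can await until startup has finished.
public sealed class ClientReadyGate
{
    private readonly TaskCompletionSource _ready =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    // Opens the gate; calling it more than once is harmless.
    public void MarkReady() => _ready.TrySetResult();

    // Completes once the gate is open, or throws if the token is cancelled first.
    public Task WaitAsync(CancellationToken cancellationToken = default) =>
        _ready.Task.WaitAsync(cancellationToken);
}

// Hypothetical hosted service. Register it AFTER UseOrleansClient so that
// (assuming sequential, in-order startup) its StartAsync runs only once the
// client's own startup has completed.
public sealed class ClientReadySignal : IHostedService
{
    private readonly ClientReadyGate _gate;

    public ClientReadySignal(ClientReadyGate gate) => _gate = gate;

    public Task StartAsync(CancellationToken cancellationToken)
    {
        _gate.MarkReady();
        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}
```

To use it, the gate would be registered as a singleton and the signal as a hosted service after the UseOrleansClient call; the function (or the repository that calls the grain) would then inject ClientReadyGate and await WaitAsync before its first grain call. I have not verified this against the Functions worker, so treat it as a starting point.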
I ran into the same issue. I am using local clustering (for now).
In my case, the access to the Orleans cluster is via a hosted service running on an external client (asp.net host).
It turns out the order of service registration on the client matters. I solved this issue by registering the hosted service after the Orleans client configuration.
Before (did not work):
Host
    .CreateDefaultBuilder(args)
    .ConfigureWebHostDefaults(fun wb ->
        wb
            .UseStaticWebAssets()
            .UseStartup<Startup>() //<-- Service that references Grain injected here
        |> ignore
    )
    .UseOrleansClient(fun oc ->
        oc
            .UseLocalhostClustering()
            .AddMemoryStreams(AmfGrains.C.Streams.PROVIDER)
        |> ignore
    )
    .UseConsoleLifetime()
    .Build()
    .Run()
Fix:
Host
    .CreateDefaultBuilder(args)
    .UseOrleansClient(fun oc ->
        oc
            .UseLocalhostClustering()
            .AddMemoryStreams(AmfGrains.C.Streams.PROVIDER)
        |> ignore
    )
    .ConfigureWebHostDefaults(fun wb ->
        wb
            .UseStaticWebAssets()
            .UseStartup<Startup>() //<-- Service that references Grain injected here
        |> ignore
    )
    .UseConsoleLifetime()
    .Build()
    .Run()
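A likely explanation for why the order matters is that the generic host starts hosted services one at a time, in the order they were registered, so anything registered before the client can run before the client has connected. The C# sketch below illustrates only that host behaviour; it does not use Orleans. FakeClient and Consumer are hypothetical stand-ins for the client's startup service and for a service that calls grains on startup.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Hypothetical stand-in for the service that connects the Orleans client.
public sealed class FakeClient : IHostedService
{
    public static volatile bool Connected;

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        await Task.Delay(200, cancellationToken); // simulate connecting to the cluster
        Connected = true;
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}

// Hypothetical stand-in for a service that needs the client as soon as it starts.
public sealed class Consumer : IHostedService
{
    public static bool? SawClientConnected;

    public Task StartAsync(CancellationToken cancellationToken)
    {
        // Record whether the client had finished starting when this service started.
        SawClientConnected = FakeClient.Connected;
        Console.WriteLine($"Consumer started, client connected: {SawClientConnected}");
        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}

public static class Program
{
    public static async Task Main()
    {
        var builder = Host.CreateApplicationBuilder();
        builder.Services.AddHostedService<FakeClient>(); // registered first, so started first
        builder.Services.AddHostedService<Consumer>();   // started after FakeClient.StartAsync completes
        using var host = builder.Build();
        await host.StartAsync();
        await host.StopAsync();
    }
}
```

With the default sequential startup, Consumer should observe the client as connected; swapping the two AddHostedService lines should make it observe the opposite, which mirrors the failing order above.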
Holy cow! Pardon my language, but @fwaris, this was it. I spent a whole day trying to figure out what I did wrong, and this was the cause. Can we get a disclaimer or a banner in the docs that points this out? In the Stocks and GPSTracker samples this is wired up in the order shown below, but there is no comment pointing that out, and I had only checked the streaming sample because I was curious about that particular part. I would never have searched for "AddHostedService" inside the samples.
My case is a simple console application for testing something. I added a hosted service before the UseOrleansClient call and wondered why InternalGrainFactory was always null inside ClusterClient.cs. Moving builder.Services.AddHostedService<Worker>(); to after the builder.UseOrleansClient() call fixed it.
var builder = Host.CreateApplicationBuilder(args);
// builder.Services.AddHostedService<Worker>(); <--- Don't do this
builder.UseOrleansClient((clientBuilder) =>
{
    clientBuilder.Configure<ClusterOptions>(options =>
    {
        options.ClusterId = "Cluster";
        options.ServiceId = typeof(Program).FullName;
    });
    var clusterConnectionString = builder.Configuration["ConnectionStrings:ClusterDatabase"]!;
    clientBuilder.UseCosmosGatewayListProvider(options =>
    {
        options.ConfigureCosmosClient(clusterConnectionString);
        options.ClientOptions = new CosmosClientOptions
        {
            ServerCertificateCustomValidationCallback = (certificate, chain, sslPolicyErrors) => true,
        };
        options.DatabaseName = "cluster-db";
    });
});
builder.Services.AddHostedService<Worker>(); // It has to be after .UseOrleansClient
using var app = builder.Build();
await app.RunAsync();
Maybe this should be added as a hint to the documentation?