azure-functions-host icon indicating copy to clipboard operation
azure-functions-host copied to clipboard

Problems with scoped lifetime services (AddScoped)

Open heikkilamarko opened this issue 5 years ago • 106 comments

Hi,

We are using durable functions with dependency injection. In our Startup.cs file, we register our dependencies as follows:

services.AddScoped<IMyService>(serviceProvider => new MyService(...))

In our activity function, we are using normal constructor injection to get IMyService instances.

The problem is that even if we are using AddScoped for registering the service, during the activity function run, each class that asks for the service, gets a different instance of IMyService. This breaks our app logic, because IMyService users won't see each other's changes.

As a workaround to the earlier Azure Functions runtime DI issues, we had the following pinning in place FUNCTIONS_EXTENSION_VERSION = 2.0.12673.0. The pinned version is not supported anymore by Azure, so our function DI is now broken again.

heikkilamarko avatar Oct 17 '19 18:10 heikkilamarko

I find it hard to believe this is by design. If it is so, documentation should state that very clearly.

heikkilamarko avatar Oct 18 '19 05:10 heikkilamarko

The problem is not durable function specific. Same happens with normal TimerTrigger functions also.

heikkilamarko avatar Oct 18 '19 06:10 heikkilamarko

I made more experiments and it seem that the DI problem occurs only when I have used services.AddHttpClient() to register HttpClients. If I comment out services.AddHttpClient() calls, AddScoped works as expected.

heikkilamarko avatar Oct 18 '19 07:10 heikkilamarko

Hi @heikkilamarko I am also facing the same issue, when this issue happens last week I have pinned to older runtime version, then it started working, now again breaking. I agree it's not an issue with durable function. Even I am using the same DI pattern for HttpClients, Do we have any workaround for httpclient register or waiting for team to provide global fix

dhanapalschandran avatar Oct 19 '19 00:10 dhanapalschandran

@dhanapalschandran Same thing on my side. Could it be breaking again due to #5096?

unt1tled avatar Oct 20 '19 20:10 unt1tled

@heikkilamarko can you please share repro code? Is this happening with factory registrations only? Some more details/repro will ensure we're looking exactly at the issue you're running into.

@unt1tled / @dhanapalschandran if you can share the code you're having issues with, we can validate with your scenarios as well.

Thanks!

fabiocav avatar Nov 01 '19 04:11 fabiocav

Hi. See below for a repro code (see the comment block in the function code):

using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using System.Net.Http;

[assembly: FunctionsStartup(typeof(Demo.App.Startup))]

namespace Demo.App
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            var services = builder.Services;

            services.AddHttpClient<MyHttpClient>();

            services
                .AddScoped<IMyService, MyService>()
                .AddScoped<IMyServiceSetter, MyServiceSetter>()
                .AddScoped<IMyServiceGetter, MyServiceGetter>();
        }
    }

    public class MyHttpClient
    {
        private readonly HttpClient httpClient;

        public MyHttpClient(HttpClient httpClient)
        {
            this.httpClient = httpClient;
        }
    }

    public interface IMyService
    {
        void Set(int value);

        int Get();
    }

    public class MyService : IMyService
    {
        private int value = 0;

        public int Get()
        {
            return value;
        }

        public void Set(int value)
        {
            this.value = value;
        }
    }

    public interface IMyServiceSetter
    {
        void SetValue(int value);
    }

    public class MyServiceSetter : IMyServiceSetter
    {
        private readonly IMyService myService;
        private readonly IMyServiceGetter myServiceGetter;

        public MyServiceSetter(IMyService myService, IMyServiceGetter myServiceGetter)
        {
            this.myService = myService;
            this.myServiceGetter = myServiceGetter;
        }

        public void SetValue(int value)
        {
            myService.Set(value);
        }
    }

    public interface IMyServiceGetter
    {
        int GetValue();
    }

    public class MyServiceGetter : IMyServiceGetter
    {
        private readonly IMyService myService;
        private readonly MyHttpClient myHttpClient;

        public MyServiceGetter(IMyService myService, MyHttpClient myHttpClient)
        {
            this.myService = myService;
            this.myHttpClient = myHttpClient;
        }

        public int GetValue()
        {
            return myService.Get();
        }
    }

    public class DemoFunction
    {
        private readonly IMyServiceSetter myServiceSetter;
        private readonly IMyServiceGetter myServiceGetter;

        public DemoFunction(
            IMyServiceSetter myServiceSetter,
            IMyServiceGetter myServiceGetter)
        {
            this.myServiceSetter = myServiceSetter;
            this.myServiceGetter = myServiceGetter;
        }

        [FunctionName("Demo")]
        public void Demo([TimerTrigger("0 */5 * * * *", RunOnStartup = true)]TimerInfo timerInfo, ILogger log)
        {
            int value1 = 123;
            myServiceSetter.SetValue(value1);
            int value2 = myServiceGetter.GetValue();

            // Logs '123 - 0'
            // It means that MyServiceGetter and MyServiceSetter got different IMyservice instances.
            // Should be the same since we are using AddScoped to register services.
            // ---
            // If you comment out MyHttpClient injection from MyServiceGetter, it logs '123 - 123'.
            log.LogInformation($"{value1} - {value2}");
        }
    }
}

heikkilamarko avatar Nov 01 '19 07:11 heikkilamarko

I believe this is the same as my issue #4914 - clearly something wrong with scoped registrations.

@fabiocav can we get an update on these tickets? Are they being investigated? Thx

t-l-k avatar Nov 09 '19 12:11 t-l-k

@fabiocav Hope you are having a nice day :) We are also hopeful that this issue will be fixed soon. We have spent a lot of time writing code to wire up our services and we really love the way scoped services work in ASP.NET Core. Let me know if there is anything I can do to help get this moving!

APIWT avatar Nov 12 '19 14:11 APIWT

We have experienced the same problem, but I also noticed this: It's only the first time after the function started/triggered, that the scoped services are 'wrong'. The second time, it seems to be fine. Has anyone else noticed this? (I was able to reproduce it with the code posted here by @heikkilamarko.)

This is a pretty big problem for us as well, any progress on this issue?

Some additional information: Azure Functions Core Tools - 2.7.1846 Function Runtime Version - 2.0.12858.0

buyckkdelaware avatar Nov 15 '19 15:11 buyckkdelaware

@buyckkdelaware during load testing I have found multiple instances of same service type within the same request scope, it occurs frequently for me, at least 1 in 100 requests. I don't know if the weight of the type is relevant, but my DbContext subclasses are definitely affected - I get missed saves because entities get written to an incorrect DbContext, injected into either repositories, or other infrastructure code. I haven't yet managed to run more than 500 requests due to the DryIOC container scope exploding (which appears to be due to DefaultHttpClientFactory, also likely implicated in the HttpClient errors you've witnessed).

Detecting these duplicates is quite easy, using an AsyncLocal static collection in the affected class's constructor to track when more than 1 distinct hashcode has been found in. DbContexts are not thread safe ofc so can't be used by multiple threads anyway.

I haven't yet "handled" it, but I'll probably just throw an exception and fail the request thread using hashcodes as the detection basis. Without handling it, my code sometimes doesn't know that it has failed, it just misses data.

t-l-k avatar Nov 25 '19 21:11 t-l-k

Are there any workarounds for this issue?

darrenhull avatar Jan 06 '20 15:01 darrenhull

@darrenhull the workarounds will require some code refactoring at the moment. This is being assigned to a sprint for investigation, so we hope to have an update on this issue soon.

fabiocav avatar Jan 06 '20 18:01 fabiocav

Hi @fabiocav,

Do you know if there is an update on this issue yet?

Sorry to keep asking but It’s a pretty big blocking issue for my company’s migration which we are pushing further and further away without being able to suggest when we can resume it.

Thanks

darrenhull avatar Jan 23 '20 21:01 darrenhull

Hi @fabiocav,

I see this has been triaged and had the p2 tag. What exactly does the p2 mean, guessing priority2?

Can you give us an update please?

Thanks

darrenhull avatar Feb 20 '20 10:02 darrenhull

@darrenhull yes, this is a priority label. The issue has been triaged, but not yet assigned. We'll have the milestone updated once this is assigned to a sprint.

fabiocav avatar Feb 20 '20 22:02 fabiocav

@fabiocav: this really sounds like a .net core bug/issue. can you confirm?

i'm running into something similar using a vanilla asp.net core service.

daniel-white avatar Mar 26 '20 20:03 daniel-white

@fabiocav : Not to blame anyone, but how can a basic feature like Scoped or not being labeled as a p2 ? This seems like a rather basic functionality ? Can you point us to workarounds to resolve this issue ?

bartdk-be avatar Mar 31 '20 14:03 bartdk-be

This appears to be a problem on at least .NET Core 2.1, but is fixed in 3.x. I've raised it here: dotnet/runtime#34586

daniel-white avatar Apr 06 '20 13:04 daniel-white

@daniel-white Any pointer on where it should be fixed in 3.1 ? Because we faced a similar issue in 3.1 (using the functions DI setup). Thanks !

bartdk-be avatar Apr 07 '20 17:04 bartdk-be

@bartdk-be: im not sure - in my miminal example at dotnet/runtime#34586 make sure your Microsoft.Extensions.Http is the latest version.

daniel-white avatar Apr 07 '20 17:04 daniel-white

Hi @t-l-k ... I guess you don't have a work around then? Same significant problem here - fighting with it for months. It seems to happen on all but trivial systems - certainly with three of our serverside blazor systems with between 30 and 50 services. We are thinking of writing a state memory solution - creating a "StateMemoryService" as a singleton and having it register the instances of each other services against the transaction's SessionId. An in-memory register of <sessionId, serviceType, instanceId>. It's a LOT of effort for what is essentially a bug. @fabiocav @daniel-white

rodhemphill avatar Apr 13 '20 01:04 rodhemphill

Hey @fabiocav, if you are up for fixing this I can help. I have three client projects that have the problem and I can easily produce a situation that doesn't get the problem. I can't give you client repos but I have a testing repo that I can share with you. The problem is easy to reproduce and demonstrate in that repo. This repo includes EF and various service classes, some of which cascade inject other services ... which are the most likely candidates for the problem. Whatever causes it the symptoms are that all "scoped" services get translated to "transient".

melbappdev avatar Apr 14 '20 00:04 melbappdev

@rodhemphill sticky tape and plasters I'm afraid. Basically failing requests and relying on the runtime to execute the request again with fingers crossed for subsequent attempts. We detect the problem in constructors of classes we know should be scoped, and thus when two instances are created in a logical execution scope (determined by AsyncLocal) we throw an exception and log it. Makes a mess of my logs - see https://github.com/Azure/azure-functions-host/issues/4914#issuecomment-601251057.

It caused quite an embarrasing false start to our UAT phase for sure (deadlines and all), customer was quite alarmed at the amount of failing transactions, and that was as I was chasing it here to get it fixed. I had to refactor in the end, as is the advice per @fabiocav. even changing the order of parameters in constructors reduced the pain! It's still failing on some non-critical code paths, but we can live with that.

Agree- anything non-trivial causes it to fail. We built a polyglot solution, but one component was full DDD stack based on Microsoft's own patterns advice due to the domain complexity. Our simpler Functions Apps (just plain HTTP and read/write storage) work without issue.

We've dropped functions from our list of candidate dotnet technologies for future products & projects in favour of plain ASP.NET Core until this is fixed. Obviously we lose the serverless benefits.

Still, I managed to get money back from Azure for the development headaches it caused, as it is a GA product. If you're suffering the same I would recommend you to also raise support requests. Of course, it didn't come close to covering the amount of expense, inconvenience, and damage to our reputation. I had a very professional support experience with Microsoft and their Functions support team were amazing. But they couldn't fix the underlying issue - that's clearly an engineering problem.

Perhaps they might treat this more seriously if having to refund more customers via support requests ......... 😉

t-l-k avatar Apr 18 '20 13:04 t-l-k

I have had all of my teams stopped all .Net Core AzFunc development, this Scoped DI issue was the final straw for me. If basic .Net Core DI dose not work who knows what other problems may show up at the wrong time, the risk is too great.

May be send a message/tweet to Damian Edwards @DamianEdwards and/or Scott Hanselman @shanselman about your AzFunc development experience.

espray avatar Apr 19 '20 02:04 espray

I think the thing that bothers me about this issue is that it has been around in some form or another for like a year and a half. The original bug was a two line fix, and I imagine the issue we are all facing currently is similar. Can we please get an update from engineering?

AnthonyIacono avatar Apr 19 '20 03:04 AnthonyIacono

We too have had to stop using @azfunc as the scoping issue makes them unusable for our solution. This is very frustrating for a team that have embraced Azure functions from day one. I’ve been very disappointed with how long this issue has taken to be resolved @fabiocav , oh wait it’s still active! We too are seeking money back on our subscriptions.

darrenhull avatar Apr 19 '20 06:04 darrenhull

I have wasted about 4 days solid on this issue and with impending delivery dates and (soon to be) an unhappy client ... I am glad you responded @t-l-k as you've saved me a lot more wasted time. Yes I agree @darrenhull that it sounds like it has taken too long to fix, but a more important question is: what has it cost? How many people like me have fallen into this hole? @fabiocav

melbappdev avatar Apr 19 '20 06:04 melbappdev

This issue is a huge blocker and addressing it has to be expedited. If there's a P1 label indicating a higher priority than P2, that one should be applied instead of P2. We cannot proceed with our project gracefully without a fix! @fabiocav

Arash-Sabet avatar Apr 23 '20 18:04 Arash-Sabet

Assigning this to the next sprint. @Arash-Sabet, if you can open an issue detailing your scenario, we might be able to assist with a workaround until this is resolved.

fabiocav avatar Apr 23 '20 20:04 fabiocav