Azure-Functions icon indicating copy to clipboard operation
Azure-Functions copied to clipboard

Possible thread pool starvation detected

Open ThomasArdal opened this issue 4 years ago • 67 comments

I would have created this under Discussions but it doesn't seem to work. I think categories are missing to be allowed to create a new discussion thread.

I'm starting to see the following warning in a log for a consumption-based function:

Possible thread pool starvation detected

I'm not sure why this problem occurs since my understanding of the consumption plan is that Azure Functions will scale the function to avoid problems like this.

The function is triggered by Azure Service bus and make a HTTP request for each message received from the bus. The HttpClient used to make the request is registered in a Startup.cs file:

builder
    .Services
    .AddHttpClient("myclient", x =>
    {
        x.Timeout = TimeSpan.FromSeconds(30);
        x.DefaultRequestHeaders.ConnectionClose = true; // Keep-Alive = false
        x.DefaultRequestHeaders.AcceptEncoding.Clear();
        x.DefaultRequestHeaders.AcceptEncoding.Add(new StringWithQualityHeaderValue("gzip"));
        x.DefaultRequestHeaders.AcceptEncoding.Add(new StringWithQualityHeaderValue("deflate"));
        x.DefaultRequestHeaders.AcceptEncoding.Add(new StringWithQualityHeaderValue("identity"));
    })
    .ConfigurePrimaryHttpMessageHandler(() => new HttpClientHandler
    {
        SslProtocols = System.Security.Authentication.SslProtocols.None,
    });

Then, in my function I inject a IHttpClientFactory and use that to create the HttpClient in each invocation:

public MyFunction(IHttpClientFactory httpClientFactory)
{
    this.httpClientFactory = httpClientFactory;
}

[FunctionName("MyFunction")]
public async Task Run([ServiceBusTrigger(...)]string mySbMsg, CancellationToken token)
{
    if (token.IsCancellationRequested) return;

    ...
    var httpClient = httpClientFactory.CreateClient("myclient");
    var request = new HttpRequestMessage();
    request.Method = HttpMethod.Get;
    request.RequestUri = ...;
    var response = await httpClient.SendAsync(request);
    ...
}

Any help here would be highly appreciated.

ThomasArdal avatar Sep 24 '21 09:09 ThomasArdal

Hi @ThomasArdal , Here is a document explaining the cause and possible solutions - https://github.com/microsoft/vs-threading/blob/main/doc/threadpool_starvation.md

Here is a blog explaining how to diagnose the issue - https://docs.microsoft.com/en-us/archive/blogs/vancem/diagnosing-net-core-threadpool-starvation-with-perfview-why-my-service-is-not-saturating-all-cores-or-seems-to-stall

Please let us know if you were still unable to resolve the error

v-anvari avatar Sep 27 '21 13:09 v-anvari

I'm getting the same error as well

2021-09-27T14:23:53Z [Verbose] [HostMonitor] Worker status: ID=0a923147-f53b-4909-9483-3ed0aa018971, Latency=45351ms 2021-09-27T14:23:53Z [Verbose] [HostMonitor] Host process CPU stats (PID 5148): History=(95,93,92,91,93), AvgCpuLoad=92.8, MaxCpuLoad=95 2021-09-27T14:23:53Z [Verbose] [HostMonitor] Host process CPU stats (PID 3028): History=(3,0,1,2,1), AvgCpuLoad=1, MaxCpuLoad=3 2021-09-27T14:23:53Z [Verbose] [HostMonitor] Host aggregate CPU load 94 2021-09-27T14:23:53Z [Warning] [HostMonitor] Host CPU threshold exceeded (94 >= 80) 2021-09-27T14:23:53Z [Warning] Possible thread pool starvation detected.

My function is runing in Nodejs, this issue didn't appear couple of weeks ago

exhaler avatar Sep 27 '21 14:09 exhaler

Same here. I first noticed this in the log on Friday, September 17, 2021 07:14:51. Nothing changed around that time on my part.

ThomasArdal avatar Sep 28 '21 11:09 ThomasArdal

@exhaler Did you by any chance disable the health monitor in your host.json file?

ThomasArdal avatar Sep 28 '21 14:09 ThomasArdal

Nope, the host.json hasn't changed. Only some app insights trackEvent were added

exhaler avatar Sep 28 '21 14:09 exhaler

Hi @ThomasArdal , Can we have the function invocation id, region name, and timestamp.

v-anvari avatar Sep 28 '21 15:09 v-anvari

I don't think I can provide an invocation id. At least an invocation id is not logged alongside the log message. I seems to come through Microsoft.Extensions.Logging but not as part of any invocation.

I have multiple instances of this logged. Here are a recent one:

Time: 2021-09-28T15:08:50.606136Z Region: North Europe

It's happening in West US, Brazil, etc. too. But only for consumption-based functions. None of my app service-based functions are logging this warning.

ThomasArdal avatar Sep 28 '21 17:09 ThomasArdal

Hi @ThomasArdal , Any invocation id would be sufficient, We require the invocation id to reach your App logs

v-anvari avatar Sep 30 '21 09:09 v-anvari

@v-anvari I see. Here's an invocation id from North Europe: 4dd84024-8b2b-4d0c-9fe2-328ea326d1cd

ThomasArdal avatar Sep 30 '21 09:09 ThomasArdal

Thank you for the information. We shall further investigate and update as appropriate

v-anvari avatar Sep 30 '21 09:09 v-anvari

Hi @ThomasArdal , I have gone through the logs, and I see that on 17 sep the instance picked up a new host version, to confirm if the issue is with the host version, can you pin the runtime 3.1.4 and let us know if you still see any issue. This will let us conclude if its because of the version.

Secondly, you can do a quick analysis by pulling the memory dump to check if any object is stuck in the pool which is causing any delays https://docs.microsoft.com/en-us/answers/questions/401671/how-to-capture-memory-dump-of-consumption-plan-fun.html

v-anvari avatar Oct 07 '21 16:10 v-anvari

@v-anvari Sure, I will try to ping the host. Do you have any documentation on how to do that?

ThomasArdal avatar Oct 07 '21 17:10 ThomasArdal

Ah sorry. I think I found it here: https://docs.microsoft.com/en-us/azure/azure-functions/set-runtime-version?tabs=portal.

I have the same function app deployed to two different regions. I'll try to set FUNCTIONS_EXTENSION_VERSION to ~3 on one of them and 3.1.4 on the other one. And observe the logs from both apps to see if they both still produce the warning. Will let you know about my findings.

ThomasArdal avatar Oct 07 '21 18:10 ThomasArdal

@v-anvari I think it's pretty safe to say that the problem disappears when pinning 3.1.4. Since yesterday, the warning has been logged three times on the function app running ~3 and zero times on the function app running 3.1.4. The function app running 3.1.4 was previously logging the warning multiple times per day.

I'll have to pin all of my function apps to 3.1.4 for now :( Thank you for your help. Would love some more information if you figure out if this is a real problem or not.

ThomasArdal avatar Oct 08 '21 13:10 ThomasArdal

Update: I have pinned all of my function apps to 3.1.4 now and haven't seen the warning since.

ThomasArdal avatar Oct 11 '21 06:10 ThomasArdal

Thank you @ThomasArdal , for bringing this to our attention. We shall investigate further on the version. But, we have a new version with fixes, can you use the latest version 3.3 and let us know if you face any challenges. Parallelly, we will investigate on our end as well.

v-anvari avatar Oct 11 '21 12:10 v-anvari

@v-anvari There's both a 3.3.0 and 3.3.1 available. Which one do you recommend me to try out?

ThomasArdal avatar Oct 11 '21 12:10 ThomasArdal

You can use either one 3.3.1. Apparently The problem was with the version 3.2

v-anvari avatar Oct 11 '21 13:10 v-anvari

@v-anvari i'm running into the same issue. I have FUNCTIONS_EXTENSION_VERSION set to ~3 will it be automatically using the new 3.3.1 version? Or should I change it to ~3.3

exhaler avatar Oct 11 '21 13:10 exhaler

@exhaler yes, when you use tilde "~", it should be using the latest version, Can you provide the invocation id, timestamp and region

v-anvari avatar Oct 11 '21 13:10 v-anvari

@v-anvari region Central US, invocation id: 8f6f818d-3d2c-4dff-9903-86cd963e2127, timestamp [UTC] | 2021-10-07T05:10:40.4240012Z

exhaler avatar Oct 11 '21 13:10 exhaler

Hi @exhaler , Apparently the functions haven't picked up the latest versions. As of now your function is running with 3.2.x. The following documentation shows how to pin the version https://docs.microsoft.com/en-us/azure/azure-functions/set-runtime-version?tabs=portal. It could be possible that function is taking the latest stable version. For now, you can use 3.1 until further investigation

v-anvari avatar Oct 11 '21 13:10 v-anvari

@exhaler I'm currently running 3.1.4 on half of my apps and 3.3.1 on the other half. I'll write an update here tomorrow (if I remember it).

ThomasArdal avatar Oct 11 '21 13:10 ThomasArdal

@v-anvari thank you for the update, I pinned it to ~3.1 and the warnings have greatly reduced, they still show up but not as frequent as before on 3.2. Any idea when will the newest version 3.3 be picked up on my azure functions?

exhaler avatar Oct 11 '21 15:10 exhaler

Quick update. None of my function apps have reported the warning since yesterday. I'll pin all of them to 3.3.1. I think this will be my process going forward where I always pin to a specific version. When a new version is released, I'll test this on a single non-critical app and rollout the new pinned version only when I feel confident that the new version is working well 👍

ThomasArdal avatar Oct 12 '21 07:10 ThomasArdal

Thankyou @ThomasArdal

@exhaler , While pinning, please don't add the ~.

v-anvari avatar Oct 12 '21 13:10 v-anvari

@v-anvari I think the pin there is for any patch version of 3.1. So, ~3.1 will allow automatic upgrades to 3.3.1, 3.3.2, etc. but not 3.4.0. Or am I misunderstanding how this works?

ThomasArdal avatar Oct 12 '21 13:10 ThomasArdal

FYI @pragnagopa

v-anvari avatar Oct 12 '21 13:10 v-anvari

@ThomasArdal , I am not sure if it works that way, to pin to a specific version, ~ is not used. As mentioned here https://docs.microsoft.com/en-us/azure/azure-functions/functions-app-settings#functions_extension_version

v-anvari avatar Oct 12 '21 13:10 v-anvari

@v-anvari I pinned it at 3.1 but i'm still getting the warning but not as often. image Additionally going over the logs, the first warning started showing up on Sept 18

exhaler avatar Oct 12 '21 13:10 exhaler