azure-functions-host
ITelemetryProcessor does not appear to be supported in Function Apps
From @StevenTaylor on October 5, 2018 5:50
Documentation (https://docs.microsoft.com/en-us/azure/application-insights/app-insights-api-filtering-sampling) suggests that ITelemetryProcessor can be used to filter/sample the telemetry before it is sent to App Insights.
But the implementation code is never called in our Function App.
The code to set up the builder is called, which looks like:
```csharp
var builder = TelemetryConfiguration.Active.TelemetryProcessorChainBuilder;
builder.Use((next) => new AppInsightsFilter(next));
builder.Build();
```
But the 'Process' method implemented in the class is never called, the class looks like:
```csharp
using System;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.Extensibility;

class AppInsightsFilter : ITelemetryProcessor
{
    private ITelemetryProcessor Next { get; set; }

    // Link processors to each other in a chain.
    public AppInsightsFilter(ITelemetryProcessor next)
    {
        this.Next = next;
    }

    public void Process(ITelemetry item)
    {
        // To filter out an item, just return
        if (!OKtoSend(item)) { return; }

        // Modify the item if required
        //ModifyItem(item);

        this.Next.Process(item);
    }

    private bool OKtoSend(ITelemetry item)
    {
        // try something!
        return (DateTime.Now.Second <= 20); // only send for the first 20 seconds of a minute
    }
}
```
PS. We have tried setting config values SamplingPercentage (App Insights config) and maxTelemetryItemsPerSecond (host.json) as low as possible to reduce the telemetry data, but there is still too much telemetry.
Copied from original issue: Azure/Azure-Functions#981
From @roryprimrose on November 2, 2018 4:13
This is a big issue for me as well. The amount of trace records being written to AI is large and increases the cost. Being able to filter out the unnecessary entries would save a lot of bandwidth and cost.
Curious whether the work around dependency injection (azure/azure-functions-host#3736) will help with this, or anything else. Pinging @brettsam as well, as he has tons of AI experience.
My scenario is that I want to remove trace messages written to AI, primarily from the storage extension while allowing custom metrics to be sent to AI. I've tried just about everything to get a resolution to this without success. I'm also using DI for my C# functions.
Here are some of the things I've found:
- I get a TelemetryClient from the host in an extension at the first available opportunity. I modify it as per the code from @StevenTaylor. I found that the client did appear to have the processors registered, but the code was never executed.
- I tried disabling AI logging of Information entries in config:
"Logging": {
"LogLevel": {
"Default": "Information"
},
"ApplicationInsights": {
"LogLevel": {
"Default": "Warning"
}
}
}
This wiped out all Information tracing to AI. I would prefer to be more targeted than that, but it solved the first problem. Unfortunately it also wiped out reporting of customMetrics to AI. I suspect this is because customMetric records are written with a trace level of Information. This was a little weird though because on reflection I realised I was sending custom metrics directly from the TelemetryClient rather than via any injected ILogger.
- I thought about the configuration above and tried using a new blank TelemetryClient for reporting custom metrics. I hoped this would not pick up the logging configuration filter. Weirdly, my new TelemetryClient also could not report custom metrics with the above configuration in play.
This is a great scenario, so it'd be interesting to walk through it here, and I can build these up on the wiki to help everyone else, too.
I'm guessing that the processor isn't working b/c the processor chain gets built here: https://github.com/Azure/azure-webjobs-sdk/blob/dev/src/Microsoft.Azure.WebJobs.Logging.ApplicationInsights/Extensions/ApplicationInsightsServiceCollectionExtensions.cs#L270. Once that happens, I believe it's immutable. If we wanted to allow custom processors, we'd need to expose some other builder mechanism that allowed you to plug in before we build. But we may be able to solve this with filters.
What are the specific messages that you're trying to silence? You can filter anything based on the category (or even the category prefix). So if you know what you want to let through and what you want to prevent, we should be able to craft the right filter. To see the category of a particular message, check the customDimensions.Category value in App Insights Analytics.
@brettsam Yes, the categories between traces and customMetrics are different. I'll try using a config filter against the traces category.
@brettsam Nope, sorry, I was right the first time. My custom metrics, written directly via TelemetryClient, are being recorded with the same category used to write trace records from the queue trigger.
I have a queue trigger `AnalyticsHintsProcessor` which logs with the category `Function.AnalyticsHintsProcessor`. I also have a timer trigger `AnalyticsScavenger` which logs with the category `Function.AnalyticsScavenger`.
These are the customMetrics that are recorded for both scenarios.
These are the trace records written by the triggers.
So I can't use a logging config filter to wipe out the trace records for the storage triggers without also wiping out the custom metrics.
Why is TelemetryClient coupled to ILogger configuration filters?
The OOTB metrics for the function are the ones that use a different category, `Host.Aggregator`.
Can you share how you're logging your metrics? We've added a `LogMetric()` extension to `ILogger` that logs metrics to App Insights with a Category of `Function.FunctionName.User` -- which is how we indicate logging coming from within your code. You could then filter based on that, which would remove all of those host logs you're seeing (the ones without `.User`).
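For reference, a minimal sketch of logging through that extension inside a function body (the function name, queue name, and metric name below are illustrative, not taken from this thread):

```csharp
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Logging; // LogMetric() extension (assumed namespace in the WebJobs SDK)
using Microsoft.Extensions.Logging;

public static class OrderMetrics
{
    [FunctionName("OrderMetrics")]
    public static void Run([QueueTrigger("orders")] string order, ILogger log)
    {
        // Shows up in App Insights as a customMetric with Category = Function.OrderMetrics.User
        log.LogMetric("OrdersProcessed", 1);
    }
}
```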
I'm not using the `ILogger.LogMetric` extension because I wanted to do pre-aggregation of my custom metrics. I'm using some code that was originally on MSDN somewhere (for the life of me I can't find it now). The logic calculates aggregated metric data for a period of time before sending the metrics to AI.
For example, I have some processes where the metric data can be aggregated to one or five minute blocks of data where hundreds of individual metrics could be tracked within those periods. This saves a lot of raw metrics being sent to AI.
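A rough sketch of that style of pre-aggregation, assuming nothing about the original MSDN sample beyond what is described above (the aggregator type and flush cadence here are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;

// Illustrative aggregator: accumulate raw values per metric name in memory and
// periodically flush one aggregated MetricTelemetry item per name instead of
// hundreds of individual TrackMetric calls.
public class MetricAggregator
{
    private readonly TelemetryClient _client;
    private readonly ConcurrentDictionary<string, ConcurrentBag<double>> _values =
        new ConcurrentDictionary<string, ConcurrentBag<double>>();

    public MetricAggregator(TelemetryClient client) => _client = client;

    public void Track(string name, double value) =>
        _values.GetOrAdd(name, _ => new ConcurrentBag<double>()).Add(value);

    // Call on a timer, e.g. every one or five minutes.
    public void Flush()
    {
        foreach (var name in _values.Keys)
        {
            if (!_values.TryRemove(name, out var bag) || bag.IsEmpty) continue;

            double sum = 0, min = double.MaxValue, max = double.MinValue;
            int count = 0;
            foreach (var v in bag)
            {
                sum += v;
                count++;
                if (v < min) min = v;
                if (v > max) max = v;
            }

            // One pre-aggregated item per metric name per flush interval.
            _client.TrackMetric(new MetricTelemetry
            {
                Name = name,
                Sum = sum,
                Count = count,
                Min = min,
                Max = max
            });
        }
    }
}
```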
As far as getting the `TelemetryClient` in this scenario, this is how I hook it up in a function. This is part of my custom dependency injection configuration.
```csharp
using EnsureThat;
using Microsoft.ApplicationInsights;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host.Config;
using Microsoft.Azure.WebJobs.Hosting;
using Microsoft.Extensions.Logging;

[assembly: WebJobsStartup(typeof(DependencyInjectionStartup))]

public class DependencyInjectionStartup : IWebJobsStartup
{
    public void Configure(IWebJobsBuilder builder)
    {
        builder.AddExtension<InjectExtensionConfigProvider>();
    }
}

public class InjectExtensionConfigProvider : IExtensionConfigProvider
{
    private readonly InjectBindingProvider _bindingProvider;

    public InjectExtensionConfigProvider(TelemetryClient telemetryClient, ILoggerFactory loggerFactory)
    {
        Ensure.Any.IsNotNull(telemetryClient, nameof(telemetryClient));
        Ensure.Any.IsNotNull(loggerFactory, nameof(loggerFactory));

        _bindingProvider = new InjectBindingProvider(telemetryClient, loggerFactory);
    }

    public void Initialize(ExtensionConfigContext context)
    {
        context.AddBindingRule<InjectAttribute>().Bind(_bindingProvider);
    }
}
```
The `InjectBindingProvider` class holds an Autofac container which is configured with the `TelemetryClient` and `ILoggerFactory` values. My custom metric aggregation logic is created by the container using the `TelemetryClient`.
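For context, a rough guess at how that wiring might look inside `InjectBindingProvider` (the Autofac registrations and member names here are assumptions, not code from this thread):

```csharp
using Autofac;
using Microsoft.ApplicationInsights;
using Microsoft.Extensions.Logging;

public class InjectBindingProvider // binding details omitted; illustrative only
{
    public IContainer Container { get; }

    public InjectBindingProvider(TelemetryClient telemetryClient, ILoggerFactory loggerFactory)
    {
        var builder = new ContainerBuilder();

        // Make the host-provided services available to anything the container resolves,
        // e.g. the custom metric aggregation logic described above.
        builder.RegisterInstance(telemetryClient);
        builder.RegisterInstance(loggerFactory).As<ILoggerFactory>();

        Container = builder.Build();
    }
}
```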
I have a related problem so I thought I'd write about it here rather than make a new issue. In my function app, there are many calls to dependent services. These calls are all tracked as Dependencies in Application Insights. The vast majority of data in my Application Insights instances are from successful dependency calls. The cost of tracking these calls has become substantial.
I've tried implementing the ITelemetryProcessor interface to filter them out, and then later found here that this doesn't work.
Is there some other way to disable Dependency Tracking in Application Insights for Function Apps? Currently, my two options are 1) Pay a substantial amount of money to needlessly track dependency successes, or 2) get rid of Application Insights.
I am also having this same problem. I have been trying to filter out Dependency Tracking as it's costing a large sum of money for data I don't really need.
@michaeldaw hit the nail on the head. We're having the exact same issue and would love to be able to reduce our AI bill by filtering out records. Particularly filtering out successful dependency calls would be a big positive.
For now we're going to have to replace our logging implementation to write useful data to another logging service until this gets some traction.
@brettsam do you have a good feel for what the right feature / fix is that could resolve this? It seems to be a fairly common ask, and I'm not sure whether this is just a doc gap, a feature gap, or something that could be solved with other in-flight work.
I apologize for the delay -- I lost track of this issue. Looping in @lmolkova for her thoughts as well.
For the core issue filed -- registering a custom `ITelemetryProcessor` should be doable and fairly easy. You could register one or more with DI and we could stick it at the end of the processor chain, after all of our built-in processors run. That shouldn't be too difficult, and as long as we clearly document where it runs in the chain, I think it makes sense. It would allow a lot of what I've listed out below to be done in a custom way by anyone.
There's a couple of other issues brought up that are good as well:
- @roryprimrose: looking back up at this comment, I think I see the issue. You're using the same `TelemetryClient` that we create via DI, which runs a custom `ITelemetryInitializer`. This will grab the current logging scope and apply it to the telemetry being logged. Because you're doing this in a background task, that likely explains why you're getting an incorrect category. If you instead created your own `TelemetryClient` there (using the same instrumentation key -- which can be pulled from an injected `IConfiguration`), you should get category-less metrics that you can log however you want (see the sketch after this list).
- Although not directly brought up, it may make sense to log metrics with a different category -- like `Function.FunctionName.Metric` -- to allow for more granular filtering. They currently get lumped in as `Function.FunctionName.User`, which is workable but may not give enough switches for everyone.
- @michaeldaw / @KonnectSalon / @jamesdheadup: Can you share what the customDimensions.Category is for the dependencies that you're seeing? I'm going through how we "tag" DependencyTelemetry right now and seeing a few things we can improve (I'll move these to a separate issue but wanted to enumerate them here):
  - If the dependency fails, we don't flip the `LogLevel` to `Error`, which we should. That would allow you to filter and only show failed dependencies.
  - We probably should add a blanket on/off for dependency auto-tracking.
  - There are scenarios where the category will be `Host.Bindings` and scenarios where the category will be `Function.FunctionName`. We should try to reconcile this for better filtering.
  - We should think about whether we want to have some threshold for slow dependencies as well -- i.e. only log (as `Warning`?) ones that take longer than 1 second (configurable).
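A minimal sketch of the separate-client idea from the first bullet above, assuming the instrumentation key is exposed through the usual `APPINSIGHTS_INSTRUMENTATIONKEY` setting and that `BackgroundMetricsPublisher` is a hypothetical consumer class:

```csharp
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.Extensions.Configuration;

// Hypothetical consumer class, not part of the Functions SDK.
public class BackgroundMetricsPublisher
{
    private readonly TelemetryClient _client;

    public BackgroundMetricsPublisher(IConfiguration configuration)
    {
        // Assumes the instrumentation key is available via the standard app setting.
        var config = new TelemetryConfiguration(configuration["APPINSIGHTS_INSTRUMENTATIONKEY"]);

        // A separate client does not run the host's telemetry initializers, so metrics sent
        // through it won't inherit the Function.<Name> category from the current logging scope.
        _client = new TelemetryClient(config);
    }

    public void TrackAggregatedMetric(string name, double value) =>
        _client.TrackMetric(name, value);
}
```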
@brettsam For me, the category seems to always be Function.FunctionName. I don't see any categorized as Host.Bindings.
Here is a code sample that demonstrates how to add a processor to the chain. We should probably simplify this and enable the same approach as we have for TelemetryInitializers:
```csharp
using System.Linq;
using FunctionApp12;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Hosting;
using Microsoft.Extensions.DependencyInjection;

[assembly: WebJobsStartup(typeof(Startup))]

namespace FunctionApp12
{
    public class Startup : IWebJobsStartup
    {
        public void Configure(IWebJobsBuilder builder)
        {
            var configDescriptor = builder.Services.SingleOrDefault(tc => tc.ServiceType == typeof(TelemetryConfiguration));
            if (configDescriptor?.ImplementationFactory == null)
                return;

            var implFactory = configDescriptor.ImplementationFactory;
            builder.Services.Remove(configDescriptor);
            builder.Services.AddSingleton(provider =>
            {
                if (!(implFactory.Invoke(provider) is TelemetryConfiguration config))
                    return null;

                config.TelemetryProcessorChainBuilder.Use(next => new AiErrorFilteringProcessor(next));
                config.TelemetryProcessorChainBuilder.Build();
                return config;
            });
        }
    }

    internal class AiErrorFilteringProcessor : ITelemetryProcessor
    {
        private readonly ITelemetryProcessor _next;

        public AiErrorFilteringProcessor(ITelemetryProcessor next)
        {
            _next = next;
        }

        public void Process(ITelemetry item)
        {
            if (!(item is TraceTelemetry trace && trace.Message == "AI (Internal): Current Activity is null for event = 'System.Net.Http.HttpRequestOut.Stop'"))
            {
                _next.Process(item);
            }
        }
    }
}
```
> If the dependency fails, we don't flip the `LogLevel` to `Error`, which we should. That would allow you to filter and only show failed dependencies.
Agree on this, but the failure criteria problem would immediately arise after that (are 404, 401, 409, and 429 failures?).
Normally we approach this problem with sampling: either everything for a particular transaction is sampled in, or nothing at all. If you don't track a dependency call, other features may be broken, like the end-to-end transaction viewer or the app map. Even though a dependency call is successful, you might still be interested in what happened for the whole transaction.
So I'd suggest carefully weighing everything before deciding to disable successful dependency tracking.
Please factor the following while weighing the pros and cons of tracking successful dependencies:
These are screenshots of the Application Insights instance and cost analysis for a particular Function App and related storage account in our system.
Above is a capture of the default 24-hour period search for the service in question. You can see that dependency tracking accounts for 1.4 million items, while Trace, Request, and Exception account for 40K, 18K, and 1.9K (really must look into those exceptions), respectively. Dependency events account for approximately 96% of all events.
This is the cost projection of the Application Insights instance. As before, the image shows that "REMOTEDEPENDENCY" events make up the vast majority of recorded events.
Finally, the above screenshot is a filtered selection from the "Cost by Resource" view, showing the cost of the Function App, Storage Accounts, and Application Insights instances in question. The cost of the Application Insights instance is 1252% that of the cost of the Function App it is monitoring.
These costs are woefully unsustainable for us. Of late, my decision to use Function Apps, which I'd touted to my colleagues as extremely cost effective, is being called into question by my teammates and superiors. Application Insights has been an invaluable tool for diagnostics and troubleshooting. I'd liken using a Function App without an associated Application Insights instance to flying by the seat of my pants. That said, I will eventually have to choose to either stop using Application Insights, or stop using Function Apps. I'm sure I'm not the only one who would really appreciate it if the Functions and Application Insights teams could find a solution by which that choice doesn't have to be made.
What @lmolkova has above will work -- it lets you grab the TelemetryConfiguration that we generate and append an ITelemetryProcessor to it. From there, you can filter out any DependencyTelemetry however you want. Thanks for putting that together @lmolkova! That at least gives folks an escape hatch while we work through the design here. We'll keep this issue open and work to come up with better configuration and extensibility for this.
And thanks a ton @michaeldaw (and others) -- having all this data makes it really easy to see how important this is.
A couple of other notes to anyone trying the DI approach above (neither of these will be required long-term, but are right now):
- Make sure to reference Microsoft.ApplicationInsights 2.7.2 in your project. Anything newer will end up with type mismatches.
- Add a direct NuGet reference to version 1.0.2 of https://www.nuget.org/packages/Microsoft.Azure.WebJobs.Script.ExtensionsMetadataGenerator. It fixes an issue when targeting .NET Core 2.1+ and using `WebJobsStartup`.
@brettsam @lmolkova Thanks for looking in to this, everyone. I've implemented the code @lmolkova posted above. Strangely, the code seems to work as expected when I debug the function app locally, but the filtering does not occur when I publish the app to Azure. I was hoping someone might have some suggestions as to why. I know there are Application Insights configuration options available in the host.json file, but I don't have anything there except the "version": "2.0" statement. I also do not have an ApplicationInsights.config file. Is there something else I might be missing?
Here's my implementation of ITelemetryProcessor specifically for filtering out DependencyTelemetry items:
```csharp
internal class AiDependencyFilter : ITelemetryProcessor
{
    private readonly ITelemetryProcessor _next;

    public AiDependencyFilter(ITelemetryProcessor next)
    {
        _next = next;
    }

    public void Process(ITelemetry item)
    {
        var request = item as DependencyTelemetry;

        if (request?.Name != null)
        {
            return;
        }

        _next.Process(item);
    }
}
```
What version of the host are you running locally? Can you check the `bin\extensions.json` file both locally and in Azure?
Azure Functions Core Tools (2.4.379 Commit hash: ab2c4db3b43f9662b82494800dd770698788bf2d)
Function Runtime Version: 2.0.12285.0
I wasn't aware of the extensions.json file, but it looks like that was the problem. Here's the local extensions.json file. The "Startup" class mentioned is the one I added that replicates the functionality @lmolkova mentioned.
```json
{
  "extensions": [
    { "name": "DurableTask", "typeName": "Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskWebJobsStartup, Microsoft.Azure.WebJobs.Extensions.DurableTask, Version=1.0.0.0, Culture=neutral, PublicKeyToken=014045d636e89289" },
    { "name": "AzureStorage", "typeName": "Microsoft.Azure.WebJobs.Extensions.Storage.AzureStorageWebJobsStartup, Microsoft.Azure.WebJobs.Extensions.Storage, Version=3.0.2.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" },
    { "name": "Startup", "typeName": "Scadavantage.Startup, Scadavantage.Data, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" }
  ]
}
```
The file in Azure did not have that last line mentioning the Startup class. I deploy to this Function App from Azure DevOps; something there must ignore the file when deploying. I can look further into that. I added the last line manually using Kudu.
With that last line in place, it seems to work! I'm not seeing Dependency calls in either the Live Metrics Stream, or when I do a search for events. What a relief!
I wanted to mention that with this filter in place, the Live Metrics view seems to be adversely affected. I no longer see values in the Request Rate or Request Duration graphs in the Live Metrics view. Could this be related to the concerns raised by @lmolkova about related events being affected by the absence of dependency events? Requests and traces still appear fine in the Search section.
In any event, I'm very much willing to live with a diminished experience with the Live Metrics Stream if it means so drastically reducing our costs. Thank you very much to everyone who's been looking in to this! It's a huge help.
Not sure if it still matters, but here's my local host version:
Azure Functions Core Tools (2.3.199 Commit hash: fdf734b09806be822e7d946fe17928b419d8a289)
Function Runtime Version: 2.0.12246.0
Can you share the package references in your .csproj? And what is your TargetFramework? I'm seeing several reports of build servers incorrectly generating the extensions.csproj so I'm trying to narrow it down.
Absolutely. The target framework is .NET Core 2.1. Here are the package references from the Function App:
```xml
<ItemGroup>
  <PackageReference Include="Microsoft.ApplicationInsights" Version="2.7.2" />
  <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.DurableTask" Version="1.7.1" />
  <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.Storage" Version="3.0.2" />
  <PackageReference Include="Microsoft.Azure.WebJobs.Script.ExtensionsMetadataGenerator" Version="1.0.2" />
  <PackageReference Include="Microsoft.Extensions.DependencyInjection" Version="2.1.1" />
  <PackageReference Include="Microsoft.NET.Sdk.Functions" Version="1.0.24" />
  <PackageReference Include="Scadavantage.Common.Core" Version="1.0.0-prerelease067" />
  <PackageReference Include="Twilio" Version="5.26.1" />
</ItemGroup>
```
It mentions a "scadavantage.common.core". Here are the package references from that package:
```xml
<ItemGroup>
  <PackageReference Include="Microsoft.ApplicationInsights" Version="2.7.2" />
  <PackageReference Include="Microsoft.AspNetCore" Version="2.1.7" />
  <PackageReference Include="Microsoft.Azure.DocumentDB.Core" Version="2.2.2" />
  <PackageReference Include="System.Security.Cryptography.Algorithms" Version="4.3.1" />
  <PackageReference Include="WindowsAzure.Storage" Version="9.3.3" />
</ItemGroup>
```
Let me know if I can help further.
Thanks, that helped confirm the problem. I just opened this issue (feel free to comment over there so we don't derail this one :-)): https://github.com/Azure/azure-functions-vs-build-sdk/issues/277
@michaeldaw thanks for the great write-up! Now I understand your motivation for disabling dependencies better. This is something we should eventually handle in the ApplicationInsights SDK (for Functions and any other applications) and we need to find the right approach for it. I'll start this discussion internally.
Not tracking successful calls to the majority of bindings (e.g. tables) is reasonable and would not create any obvious issues with UX as long as these dependencies are leaf nodes.
Not tracking http calls to your own services or some output bindings (e.g. queues) would break end-to-end tracing. Think about transaction traces as a tree: if one node is missing, the reference is lost and instead of one tree, we now have two. We still know they are correlated, but causation is lost. In some cases, it is still good enough and certainly much better than the costs associated with redundant data.
As @brettsam mentioned (https://github.com/Azure/azure-webjobs-sdk/issues/2123), we'll provide better configuration to turn off the collection.
I'll also check why sampling was not useful for the original issue author, maybe something is broken here:

> PS. We have tried setting config values SamplingPercentage (App Insights config) and maxTelemetryItemsPerSecond (host.json) as low as possible to reduce the telemetry data, but there is still too much telemetry.
@lmolkova: thank you for explaining this. In our case, the functions in question are HTTP-triggered. The dependencies in question are, for the most part, calls to tables using the .NET storage API. It sounds like these kinds of calls would fall under the second part of your explanation (http calls, etc.).
I've modified the original `AiDependencyFilter` class that I posted earlier to allow failed dependency calls through the filter. I've only tested it a little bit, but it seems to restore the end-to-end tracing, at least for those function calls. This is entirely sufficient for my purposes. It's really only when the calls fail that I'm interested in the tracing.
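A minimal sketch of what that modified filter might look like, assuming failures are detected via `DependencyTelemetry.Success` (the exact version posted isn't included in the thread):

```csharp
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

internal class AiDependencyFilter : ITelemetryProcessor
{
    private readonly ITelemetryProcessor _next;

    public AiDependencyFilter(ITelemetryProcessor next)
    {
        _next = next;
    }

    public void Process(ITelemetry item)
    {
        // Drop only dependency calls that are known to have succeeded;
        // failed (or unknown-status) dependencies flow through so their
        // end-to-end traces are preserved.
        if (item is DependencyTelemetry dependency && dependency.Success == true)
        {
            return;
        }

        _next.Process(item);
    }
}
```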
Thank you for your help.
@brettsam I am also running into a situation where I want to grab a reference to the TelemetryClient created by the runtime. Is there another way to get hold of the instance without hooking into a custom IExtensionConfigProvider? It's my understanding that without any custom bindings the extension would never be loaded. Here is what I have so far (I have a startup as well):
```csharp
namespace Infrastructure.Logging
{
    using DryIoc;
    using Microsoft.ApplicationInsights;
    using Microsoft.Azure.WebJobs.Host.Config;
    using System;

    public class TelemetryClientExtensionConfigProvider : IExtensionConfigProvider
    {
        private readonly TelemetryClient telemetryClient;

        public TelemetryClientExtensionConfigProvider(TelemetryClient telemetryClient)
        {
            this.telemetryClient = telemetryClient ?? throw new ArgumentNullException(nameof(telemetryClient));
        }

        public void Initialize(ExtensionConfigContext context)
        {
            Main.Container.UseInstance(telemetryClient);
        }
    }
}
```
@michaeldaw The Live Metrics Stream will stop working when you simply replace `TelemetryConfiguration` in `IWebJobsBuilder.Services`. That is because when you call `TelemetryProcessorChainBuilder.Build()`, the `QuickPulseTelemetryProcessor` is recreated, so you have to register it again.
Here is my solution. Add `Microsoft.ApplicationInsights.PerfCounterCollector` v2.7.2 through NuGet, then:
```csharp
var quickPulseProcessor = config.TelemetryProcessors
    .OfType<QuickPulseTelemetryProcessor>()
    .FirstOrDefault();

if (!(quickPulseProcessor is null))
{
    var quickPulseModule = provider
        .GetServices<ITelemetryModule>()
        .OfType<QuickPulseTelemetryModule>()
        .FirstOrDefault();

    quickPulseModule?.RegisterTelemetryProcessor(quickPulseProcessor);
}

config.TelemetryProcessors
    .OfType<ITelemetryModule>()
    .ToList()
    .ForEach(module => module.Initialize(config));
```
@mpaul31 -- have you tried putting a `TelemetryClient` in your function class's constructor? It should be injected there, as `TelemetryClient` is a service registered with DI.
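For illustration, a minimal constructor-injection sketch of that suggestion (the class, function, and queue names are placeholders, not from this thread):

```csharp
using Microsoft.ApplicationInsights;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class MetricsFunctions
{
    private readonly TelemetryClient _telemetryClient;

    // TelemetryClient is registered with DI by the host, so it can be constructor-injected.
    public MetricsFunctions(TelemetryClient telemetryClient)
    {
        _telemetryClient = telemetryClient;
    }

    [FunctionName("ProcessWorkItem")]
    public void Run([QueueTrigger("work-items")] string message, ILogger log)
    {
        _telemetryClient.TrackMetric("WorkItemsProcessed", 1);
        log.LogInformation("Processed a work item.");
    }
}
```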
@dzhang-quest Thank you very much for posting that code. It's working great. That was really the final piece of the puzzle. I thought I'd post the complete class in case anyone is interested. The programmer should be able to just add this class to their Azure Function project.
```csharp
using MyNamespace;
using System.Linq;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.ApplicationInsights.Extensibility.PerfCounterCollector.QuickPulse;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Hosting;
using Microsoft.Extensions.DependencyInjection;

[assembly: WebJobsStartup(typeof(Startup))]

namespace MyNamespace
{
    public class Startup : IWebJobsStartup
    {
        public void Configure(IWebJobsBuilder builder)
        {
            var configDescriptor = builder.Services.SingleOrDefault(tc => tc.ServiceType == typeof(TelemetryConfiguration));
            if (configDescriptor?.ImplementationFactory == null)
                return;

            var implFactory = configDescriptor.ImplementationFactory;
            builder.Services.Remove(configDescriptor);
            builder.Services.AddSingleton(provider =>
            {
                if (!(implFactory.Invoke(provider) is TelemetryConfiguration config))
                    return null;

                config.TelemetryProcessorChainBuilder.Use(next => new AiDependencyFilter(next));
                config.TelemetryProcessorChainBuilder.Build();

                var quickPulseProcessor = config.TelemetryProcessors
                    .OfType<QuickPulseTelemetryProcessor>()
                    .FirstOrDefault();

                if (!(quickPulseProcessor is null))
                {
                    var quickPulseModule = provider
                        .GetServices<ITelemetryModule>()
                        .OfType<QuickPulseTelemetryModule>()
                        .FirstOrDefault();

                    quickPulseModule?.RegisterTelemetryProcessor(quickPulseProcessor);
                }

                config.TelemetryProcessors
                    .OfType<ITelemetryModule>()
                    .ToList()
                    .ForEach(module => module.Initialize(config));

                return config;
            });
        }
    }

    internal class AiDependencyFilter : ITelemetryProcessor
    {
        private readonly ITelemetryProcessor _next;

        public AiDependencyFilter(ITelemetryProcessor next)
        {
            _next = next;
        }

        public void Process(ITelemetry item)
        {
            var request = item as DependencyTelemetry;

            if (request?.Name != null)
            {
                return;
            }

            _next.Process(item);
        }
    }
}
```