msgraph-sdk-dotnet

Out of the box reliability telemetry

Open • adamedx opened this issue 1 year ago • 1 comment

Is your feature request related to a problem? Please describe.

Developers and their customers have no way to understand behavior as seen by the client, particularly timeouts, latency, and similar issues. They need a way to see this across all deployments of their application. This would help them assess the health of the infrastructure between the application and the Microsoft Graph frontends, and distinguish client issues from problems with the API services themselves.

Describe the solution you'd like

  • Ability to opt in to sharing reliability information (request status code, request ID, total time between the request and the first response). This must not include identity information such as user or application identifiers or client IP addresses.
  • Ability to configure the destination of this data
  • Ability to control the behavior at runtime (i.e. disable it)
  • The capability should use a standard interface and format for logging and telemetry transport rather than a specialized implementation.
  • Everything about the solution must comply with any data and privacy policies or regulations that apply to the environment running the Microsoft Graph SDK.
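
To make the requirements above concrete, here is a minimal, language-neutral sketch in Python (the SDK itself is .NET, so this is not a proposed implementation). Every name in it is hypothetical: `ReliabilityRecord`, `ReliabilityTelemetry`, and the `send` callable stand in for whatever the SDK would actually expose, and the request ID is assumed to arrive in a `request-id` response header. It shows an opt-in flag that can be flipped at runtime, a configurable destination, a default sink built on a standard logging interface, and a record limited to the three non-identifying fields listed above.

```python
import logging
import time
from dataclasses import asdict, dataclass
from typing import Any, Callable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class ReliabilityRecord:
    """Hypothetical record: only the non-identifying fields from the request."""
    status_code: int
    request_id: Optional[str]
    time_to_first_response_ms: float


# Hypothetical transport: takes a request, returns (status code, response headers).
Send = Callable[[Any], Tuple[int, Mapping[str, str]]]


def _log_sink(record: ReliabilityRecord) -> None:
    # Default destination: the standard logging module, so applications can
    # route records with their existing handlers.
    logging.getLogger("graph.reliability").info("graph_request", extra=asdict(record))


class ReliabilityTelemetry:
    def __init__(self,
                 sink: Optional[Callable[[ReliabilityRecord], None]] = None,
                 enabled: bool = False) -> None:
        self.enabled = enabled          # off by default (opt-in); mutable at runtime
        self._sink = sink or _log_sink  # configurable destination

    def observe(self, send: Send, request: Any) -> Tuple[int, Mapping[str, str]]:
        """Call send(request) and, if enabled, emit one reliability record."""
        start = time.monotonic()
        status_code, headers = send(request)
        elapsed_ms = (time.monotonic() - start) * 1000.0
        if self.enabled:
            # Copy only the request ID (assumed header name); every other
            # header is deliberately left out of the record.
            self._sink(ReliabilityRecord(status_code,
                                         headers.get("request-id"),
                                         elapsed_ms))
        return status_code, headers
```

Passing `sink=some_list.append` redirects records to an in-memory list, and setting `telemetry.enabled = False` stops emission without touching the request path. A real implementation would also need to record timeouts and transport failures, which this sketch leaves out.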

Consider using the same approach as the MSAL library, which is often used with the Microsoft Graph SDK:

  • MSAL feature: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/issues/3511
  • MSAL Code: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/tree/main/src/client/Microsoft.Identity.Client/TelemetryCore
  • MSAL Docs (still pending): https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/issues/3541

Describe alternatives you've considered

Having developers install additional monitoring tools that log and aggregate Microsoft Graph performance information is certainly possible, but it requires significant effort from customers.

Additional context

Note that if each application is left to define its own reliability tracing, coverage will be (and is already!) very uneven. Each application must make the effort to do this, raising the cost.

adamedx • Oct 05 '22 22:10