azure-sdk-for-python

Resource error

Open jeremydvoss opened this issue 2 years ago • 8 comments

Migrated from old repo issue

Is it possible to make _setup_resources() optional in configure_azure_monitor()? I'm referring to the following segment of code:

def configure_azure_monitor(**kwargs) -> None:
    """This function works as a configuration layer that allows the
    end user to configure OpenTelemetry and Azure monitor components. The
    configuration can be done via arguments passed to this function.

    :keyword str connection_string: Connection string for your Application Insights resource.
    :keyword credential: Token credential, such as `ManagedIdentityCredential` or `ClientSecretCredential`,
     used for Azure Active Directory (AAD) authentication. Defaults to `None`.
    :paramtype credential: ~azure.core.credentials.TokenCredential or None
    :keyword bool disable_offline_storage: Boolean value to determine whether to disable storing failed
     telemetry records for retry. Defaults to `False`.
    :keyword str storage_directory: Storage directory in which to store retry files. Defaults to
     `<tempfile.gettempdir()>/Microsoft/AzureMonitor/opentelemetry-python-<your-instrumentation-key>`.
    :keyword str logger_name: The name of the Python logger that telemetry will be collected.
    :rtype: None
    """

    configurations = _get_configurations(**kwargs)

    disable_tracing = configurations[DISABLE_TRACING_ARG]
    disable_logging = configurations[DISABLE_LOGGING_ARG]
    disable_metrics = configurations[DISABLE_METRICS_ARG]

    # Setup resources
    _setup_resources() # <----- this line

   ....

When developing a containerized FastAPI application, I encounter an issue during live reloads or application startups. The error message is as follows:

Exception  in detector <opentelemetry.resource.detector.azure.vm.AzureVMResourceDetector object at 0xffffa58ef810>, ignoring

This error repeats several times, a few seconds apart, and ultimately leads to:

Cannot call collect on a MetricReader until it is registered on a MeterProvider

After these messages, the application starts normally. However, this issue is a significant inconvenience during local development, particularly when experimenting with Azure Application Insights.

I suspect _setup_resources() is the root cause, as it seems to configure the environment as if it were an Azure resource. Could this be the case?

Additionally, I've only started encountering this error in version 1.1.0; it wasn't an issue in previous versions.

Lastly, why doesn't this repository reflect the latest version of the package?

jeremydvoss avatar Nov 20 '23 22:11 jeremydvoss

Investigating. @harmankaya I have not been able to reproduce this. Are you seeing this locally or on a non-VM Azure Resource?

jeremydvoss avatar Nov 20 '23 22:11 jeremydvoss

I am seeing this on my local machine and a non-VM Azure resource. I tried downgrading back to 1.0.0 but still got the same issue.

harmankaya avatar Nov 21 '23 06:11 harmankaya

Referencing issue here with steps to reproduce it. This setup worked ~3 weeks ago with a POC setup.

Freshchris01 avatar Nov 27 '23 11:11 Freshchris01

@harmankaya What app frameworks are you using? For example: Flask, Django, FastAPI, uvicorn, gunicorn, etc. @Freshchris01 You linked this issue. Did you mean to link something else?

jeremydvoss avatar Nov 27 '23 22:11 jeremydvoss

Sorry, the company I work for uses FastAPI with uvicorn. More details here: #33295.

Freshchris01 avatar Nov 27 '23 23:11 Freshchris01

We have released azure-monitor-opentelemetry==1.1.1. This includes the ability to disable resource detectors. For instance, to disable the VM detector but leave the App Service detector on, the customer can set the environment variable OTEL_EXPERIMENTAL_RESOURCE_DETECTORS="azure_app_service".

Let me know if this solves the problem.
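As a rough illustration of the suggestion above, the variable can also be set from Python before telemetry is configured. This is a minimal sketch: `limit_resource_detectors` is a hypothetical helper, not part of any package, and it assumes the variable is read as a comma-separated list of detector names at the time `configure_azure_monitor()` builds its resource.

```python
import os


def limit_resource_detectors(names):
    """Hypothetical helper: restrict resource detection to the named detectors."""
    # Assumption: the value is a comma-separated list of detector names.
    os.environ["OTEL_EXPERIMENTAL_RESOURCE_DETECTORS"] = ",".join(names)
    return os.environ["OTEL_EXPERIMENTAL_RESOURCE_DETECTORS"]


# Keep the App Service detector and leave out the VM detector.
limit_resource_detectors(["azure_app_service"])

# Only after the variable is set would the application configure telemetry:
# from azure.monitor.opentelemetry import configure_azure_monitor
# configure_azure_monitor()
```

Setting the variable in the container or shell environment has the same effect and avoids any dependence on import order.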

jeremydvoss avatar Dec 05 '23 19:12 jeremydvoss

@harmankaya Please try with the latest version of opentelemetry-resource-detector-azure==0.1.1 https://pypi.org/project/opentelemetry-resource-detector-azure/0.1.1/

jeremydvoss avatar Jan 10 '24 22:01 jeremydvoss

Hi! I just tried it with version 1.1.1 and OTEL_EXPERIMENTAL_RESOURCE_DETECTORS="azure_app_service", it seems to work now.

I have not yet looked deeply into the code, but I am using FastAPI in a Docker container on a non-Azure resource. How does setting OTEL_EXPERIMENTAL_RESOURCE_DETECTORS="azure_app_service" affect the application? Is there a way to set OTEL_EXPERIMENTAL_RESOURCE_DETECTORS to none/empty?

harmankaya avatar Jan 11 '24 06:01 harmankaya

@harmankaya OTEL_EXPERIMENTAL_RESOURCE_DETECTORS="azure_app_service" turns off the VM Resource Detector but leaves on the App Service Detector. Both resource detectors run automatically in order to add appropriate resource attributes for an app on Azure App Service or an Azure Virtual Machine. If a user's app is on something else, the resource detectors should do nothing. However, it seems that something was causing the VMResourceDetector to sometimes trigger a blank "Exception()". Resource Detector errors are always ignored, so this blank exception does not crash the app, but it does confuse users. The VMResourceDetector was also missing a timeout, which was added in the recent release.

To summarize:

  1. Upgrade to azure-monitor-opentelemetry==1.2.0
  2. Resource Detector errors do not crash the app.
  3. If you still experience that message and want to turn off a resource detector, manually set OTEL_EXPERIMENTAL_RESOURCE_DETECTORS to exclude it.

jeremydvoss avatar Jan 22 '24 19:01 jeremydvoss

I believe I have discovered the source of the confusing warning: The concurrent.futures system that the OpenTelemetry SDK is using to run the resource detectors is not working correctly for processes that can take longer than 5 seconds.

I have made an issue in the OTel repo to track this.
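The blank message can be reproduced with a small standalone sketch. This is not the OpenTelemetry SDK's code: `slow_detector` and `run_detector` are hypothetical stand-ins, and the warning format is assumed to mirror the log line quoted earlier in the thread. The point it illustrates is that the timeout raised by `Future.result()` carries no message, so formatting it into a log line leaves an empty gap.

```python
import concurrent.futures
import time


def slow_detector(delay_seconds):
    """Stand-in for a resource detector that takes a while to respond."""
    time.sleep(delay_seconds)
    return {"cloud.provider": "azure"}


def run_detector(delay_seconds, timeout_seconds):
    """Run the detector in a worker thread and give up after timeout_seconds."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(slow_detector, delay_seconds)
        try:
            return future.result(timeout=timeout_seconds)
        except Exception as ex:
            # The timeout raised by Future.result() has no message,
            # so str(ex) is the empty string.
            return f"Exception {ex} in detector, ignoring"


print(repr(run_detector(0.3, 0.05)))  # 'Exception  in detector, ignoring'
print(repr(run_detector(0.0, 1.0)))   # {'cloud.provider': 'azure'}
```

Note the double space after "Exception" in the first result, which matches the warning reported at the top of this issue.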

jeremydvoss avatar Jan 22 '24 21:01 jeremydvoss

The issue stems from an unclear timeout in the OTel SDK. My fix will be in the next release. To avoid triggering the 5-second timeout, the VM Resource Detector now sets its own timeout of 4 seconds. Please update to opentelemetry-resource-detector-azure==0.1.3.
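The idea of an inner timeout that is shorter than the outer one might look like the sketch below. It is an illustration only, not the detector's actual implementation: `fetch_vm_metadata` is a hypothetical function, and the metadata URL and API version are assumptions based on the Azure Instance Metadata Service. Any failure, including a timeout, yields an empty result instead of an exception.

```python
import http.client
import json
import urllib.request

# Assumed Azure Instance Metadata Service endpoint (only reachable from inside an Azure VM).
IMDS_URL = (
    "http://169.254.169.254/metadata/instance/compute"
    "?api-version=2021-02-01&format=json"
)


def fetch_vm_metadata(url=IMDS_URL, timeout_seconds=4.0):
    """Hypothetical probe: return VM metadata as a dict, or {} on any failure."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    try:
        # The 4-second default stays below the 5-second limit mentioned above.
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return json.loads(response.read())
    except (OSError, ValueError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors, and invalid JSON.
        return {}
```

Outside Azure, calling `fetch_vm_metadata()` would be expected to return `{}` once the connection attempt fails or times out. The `timeout` passed to `urlopen` applies to individual blocking socket operations, so it is a close bound on the total time rather than an exact one.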

jeremydvoss avatar Jan 25 '24 20:01 jeremydvoss