flagd icon indicating copy to clipboard operation
flagd copied to clipboard

[FEATURE] flagd should use standard OTel SDK env vars

Open austindrenski opened this issue 1 year ago • 5 comments

Requirements

Request

Replace --otel-collector-uri/FLAGD_OTEL_COLLECTOR_URI and --metrics-exporter/FLAGD_METRICS_EXPORTER with well-known OTel SDK configuration environment variables.

  • https://opentelemetry.io/docs/concepts/sdk-configuration/
    • https://opentelemetry.io/docs/concepts/sdk-configuration/general-sdk-configuration/
    • https://opentelemetry.io/docs/concepts/sdk-configuration/otlp-exporter-configuration/
services:
  flagd:
    command:
    - start
    - --debug
    - --uri
    - https://raw.githubusercontent.com/open-feature/flagd/main/docs/assets/demo.flagd.json
    depends_on:
      otel:
        condition: service_started
    environment:
-     FLAGD_METRICS_EXPORTER: otel
-     FLAGD_OTEL_COLLECTOR_URI: otel:4317
+     OTEL_EXPORTER_OTLP_ENDPOINT: http://otel:4318
+     OTEL_EXPORTER_OTLP_PROTOCOL: http/protobuf
+     OTEL_METRICS_EXPORTER: otlp
    image: ghcr.io/open-feature/flagd
    ports:
    - 8013:8013
    - 8014:8014

  otel:
    image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector
    ports:
    - 4317:4317
    - 4318:4318

IMO this approach would provide a better telemetry story overall for flagd since one of the main tenants of the OTel project is to avoid re-inventing the wheel on configuration, but more specifically, I'm opening this issue after running into a pinch-point caused by flagd making an opinionated decision about OTEL_EXPORTER_OTLP_PROTOCOL=grpc.

In my specific use case, supporting the standard OTel SDK env vars would mean that I could, for example, set OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf in order to communicate with an existing OTel Collector gateway cluster that doesn't accept gRPC traffic.

(I could, of course, work around this by making my existing otelcol gateway cluster listen for gRPC traffic too, but (1) there are genuine, non-trivial DevSecOps considerations involved, and more importantly, (2) I don't think flagd intended to be opinionated about this, so hoping there's an appetite for revisiting the current setup.)

austindrenski avatar Jan 08 '24 23:01 austindrenski

Thanks @austindrenski, I'll take a look tomorrow. Using the officially supported environment variables makes sense to me.

beeme1mr avatar Jan 09 '24 02:01 beeme1mr

Hey @austindrenski, thanks for the suggestion. You are right, currently we are opinionated and we force gRPC as protocol to dispatch OTel data to its collector. I think it was a reasonable opinion since FlagD talks gRPC already. The reason for forcing a protocol is due to the OTel Go SDK which currently doesn't rely on OTEL_EXPORTER_OTLP_PROTOCOL to decide which proto to use. Instead, it forces to pass a client to the exporter creation (see here).

I agree that we should support both, http and gRPC :) having just gRPC should not be our opinion :) However, I disagree with removing the explicit configuration. I would rather keep it backward compatible and adapt the code base by doing the following:

  • check for --otel-collector-uri, if present use it, otherwise next step
  • check for env var FLAGD_OTEL_COLLECTOR_URI, if present use it, otherwise next step
  • check for OTEL_EXPORTER_OTLP_PROTOCOL, if present use it, otherwise don't export telemetry data.

thisthat avatar Jan 11 '24 09:01 thisthat

@thisthat How could a user set the protocol if the logic exits the flow if a URI is present? Wouldn't you need the URI and the protocol for this to work propertly?

I was thinking something like this:

  • --otel-collector-uri > env FLAGD_OTEL_COLLECTOR_URI. If neither is defined, OTel NoOps.
  • env OTEL_EXPORTER_OTLP_PROTOCOL or default to grpc.

beeme1mr avatar Feb 27 '24 16:02 beeme1mr

How could a user set the protocol if the logic exits the flow if a URI is present

@beeme1mr I was thinking of parsing the schema part of the URI to determine grpc vs http

thisthat avatar Feb 28 '24 07:02 thisthat