opentelemetry-specification Should file configuration merge environment variable configuration?

The conversation about whether file configuration should completely ignore the sdk environment variable scheme came up in #3744, but that PR doesn't actually contain any language related to this.

The original file configuration OTEP stated:

Interpret the configuration model and return SDK TracerProvider, MeterProvider, LoggerProvider which strictly reflect the configuration object's details and ignores the opentelemetry environment variable configuration scheme.

As mentioned here, file configuration doesn't actually contain language describing this behavior. It was originally included in #3437 but was lost in the PR review shuffle - accidentally, not in response to feedback.

@tedsuo argues in favor of file configuration respecting env vars with:

The common expectation among developers is that env vars will automatically overwrite config parameters. If we do it the other way, I am concerned that until the end of time we will have a steady stream of users lodging issues about this and then becoming very frustrated when we explain that they need to modify the config file to use an env var.

The reason that these users will be upset is that they are in a situation where having to define the env vars in the config file is a non-starter. Their use case is that for some reason they either can't get at or are not allowed to modify the config file template, but they really need to override a parameter.

For reference, this is an issue that has come up on many OSS projects I have been involved with where it is common to have both operators and application developers wanting to configure the same thing. Often it's some kind of emergency situation where the person with the rights to change the config file is unavailable.

I should note that of course the opposite situation could be true, where for some reason you want to disable an env var but don't have access to it. But it's probably easier to give users the ability to disable an env var via the config file than it is to give them the ability to disable a config parameter via an env var.

@MrAlias argues in favor of ignoring env vars with:

To be fair, from experience, you're going to get people complaining either way. If you choose environment variable priority over a configuration, users will complain that their deployment was altered and failed when an environment variable was set that took precedence.

However, if you make environment variables take precedence you will also need to make some pretty severe and subjective choices on how they are mapped to a config. Do the BSP environment variables apply to all batch span processors or just one? Is the sampler environment variable used for all tracer providers in the config, even ones that specify alternate samplers? Should propagators be merged or overridden? If an exporter is defined by environment, does that stop the console logging exporter used for debugging as well as all the other exporters defined in configuration?

Ultimately, I think the current changes are going to be the most appropriate. They allow users to make their own choice in precedence without making subjective choices for them on how to map things. If a user wants environment variables to take precedence, all they need to do is use the OTel environment keys in the related parts of their configuration. In doing so they will answer each of the above questions their own way.

@trask supports the feeling of users expecting env vars to override file configuration, but also says merging configuration from multiple sources is hard:

This is my feeling as well, especially for things like:

OTEL_SDK_DISABLED
OTEL_RESOURCE_ATTRIBUTES
OTEL_SERVICE_NAME
OTEL_LOG_LEVEL
OTEL_PROPAGATORS
OTEL_TRACES_SAMPLER_ARG

I totally understand the nightmare that is merging configuration from multiple sources though.

I wonder if we would have created many of the other env vars (e.g. OTEL_BSP_, OTEL_BLRP_) if we had configuration file support from the beginning? And if so, maybe we can deprecate those other env vars in favor of configuration file?

This topic came up several times during the lengthy review of the file configuration OTEP. Below are links to a number of relevant points:

https://github.com/open-telemetry/oteps/pull/225#discussion_r1116269308

Layering of config as described below would make it more difficult to reason about what my program is actually being run with.

Additionally, perhaps give them a helper config file that uses env var substitution in the right places so that they can migrate easily (and still get a warning until they get rid of env variables and move everything to the config file).

https://github.com/open-telemetry/oteps/pull/225#discussion_r1119068865

What about 'Solution 3: fail when both environment configuration and file configuration are present'?

I think we could log a warning when we detect this, but failing is too strong. Consider the implications if a user is operating in an environment they don't fully control (i.e. where an ops team configures environment variables by default which they extend / layer on top of).

https://github.com/open-telemetry/oteps/pull/225#discussion_r1142380977

I have a (rather strong) opinion that a setting set via env vars has higher priority (takes precedence) over a setting set via configuration file and it should be marked as a goal.

Any scheme where environment variables have priority over a config file will require some sort of standard mapping between the environment variable schema and file config scheme. IMO, it's impossible to define such a mapping which is intuitive in all cases, so better not to try.

Nothing is forcing users to use file based config - it's opt-in. If they do opt in, they're opting into the documented behavior in which the config file represents the source of truth for configuration. If they wish to customize the experience with additional layers / overrides, they have a couple of tools:

They can use the fact the Configure(config) API accepts a config model as an argument, and provide their own customizations to the model after the initial parsing of the file via Parse(file). An example of such a customization would be to interpret environment variables and apply them to the model in a way they decide makes sense.

They can use environment variable substitution to reference environment variables directly in a configuration file.

Oct 31 '23 16:10 jack-berg

My point of view is that while I understand that there is precedence in other systems to have environment variables override other configuration sources, we shouldn't do it. The environment variable configuration scheme does not map cleanly to file configuration, leading to unintuitive behavior when trying to merge. A partial mapping of some of the environment variables is also unintuitive. If we support environment variable substitution and default values (#3744), that will allow us to provide a file configuration "template" to users as a starting point, complete with comments and env var substitution references to the environment variable scheme with default fallbacks. Note this was the conclusion of OTEP conversation https://github.com/open-telemetry/oteps/pull/225#discussion_r1119068865.

What's not to like? Users start out with the template that essentially is performing the merging of the environment variable scheme with the file configuration scheme (where it makes sense). The template has comments which reinforce that if they delete the env var substitution references, those environment variables will not be considered during parse / create. This should make migrating from environment variable schema to file configuration smooth, while also allowing the implementation to have simple intuitive rules that are easy to explain and which are logically consistent. While some users will inevitably ignore the documentation and expect environment variables to trump file configuration anyway, they'll surely be able to understand the motivation when pointed to the docs.
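To make the template idea concrete, here is a rough sketch of substitution with default fallbacks. The `${NAME:-default}` syntax, the rule that an unset variable without a default becomes an empty string, and the keys in `TEMPLATE` are all assumptions made for illustration; `substitute` is a hypothetical helper, not an SDK function.

```python
import re

# Matches ${NAME} and ${NAME:-default}. The ":-" default syntax is an assumption.
_REFERENCE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute(text, environ):
    """Replace each reference with the env value, else its default, else ''."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        value = environ.get(name)
        if value is not None:
            return value
        return default if default is not None else ""

    return _REFERENCE.sub(replace, text)


# Illustrative template: deleting a ${...} reference means that variable
# is no longer considered at all.
TEMPLATE = """\
resource:
  attributes:
    service.name: ${OTEL_SERVICE_NAME:-unknown_service}
tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}
"""

rendered = substitute(TEMPLATE, {"OTEL_SERVICE_NAME": "checkout"})
print(rendered)
```

With only `OTEL_SERVICE_NAME` set, the service name comes from the environment and the endpoint falls back to the default written in the template.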

Oct 31 '23 16:10 jack-berg

I agree with @jack-berg (https://github.com/open-telemetry/opentelemetry-specification/issues/3752#issuecomment-1787580645). It's better to have a clean break in both user interface and SDK components interface: components should no longer support env vars directly, they should always take parameters from a config object, and the config object can support env vars as placeholders, but without any semantic meaning attached to their names (it's up to the user to pick those names). It is minimalistic design with clean encapsulation and separation of responsibilities. In that sense env vars do override config file, but only because user explicitly writes them this way, not because of some magic mapping between env var name and the exact position in the config.

Having said that, before we go this way we need to have a very clear migration strategy to avoid introducing breaking changes. The default config template that pulls in all the env vars already defined in the spec is a good approach. What's not clear to me there is whether it requires support for conditionals, because, for instance, in order to have a place in the config to refer to some var related to OTLP exporter, we need to know that OTLP exporter is what the user actually wants to use. It may be possible to accommodate using the declare/use separation used in the Collector config, i.e. the template config will always have a section for OTLP exporter (and all its env vars), but the tracer config portion may not reference the exporter as "I want to use it".

So basically I think we need a working prototype of the config to validate that this approach is workable.

Another migration concern is whether it can be incremental - there are many components in the SDK, and requiring them all to be upgraded to the config object style of initialization (vs env vars) before anything can be released is going to be a problem. The above approach with a default config template seems like it could be incremental.

One more open question - how such default config template will play with the ability to minimize runtime dependencies of the SDK?

Oct 31 '23 17:10 yurishkuro

Please beware that popular frameworks have had their config systems in place for years; creating yet another config file might be nice from the OTel point of view but not for the frameworks that already instantiate an OTel SDK of their own.

Placing all possible configs in an OTel config will require frameworks to scan for services there, for example, if we want them to work in native mode. Frameworks already do many other things, like providing defaults, validations and value transformations. All this is already done and would be bypassed by the new OTel config file.

From my point of view, the properties supplier from the SDK must have the highest priority, higher than the OTel config file. Moreover, there shouldn't be any configuration of the SDK that cannot be performed by using the properties supplier.

In relation to the env. vars., in all frameworks I can remember, they take precedence over any other kind of configuration. OTel shouldn't be special.

Jan 18 '24 18:01 brunobat

From my point of view, the properties supplier from the SDK must have the highest priority, higher than the OTel config file. Moreover, there shouldn't be any configuration of the SDK that cannot be performed by using the properties supplier.

These are java specific concepts @brunobat. Can you generalize this feedback to the spec level?

Jan 18 '24 19:01 jack-berg

Please beware that popular frameworks have had their config systems in place for years; creating yet another config file might be nice from the OTel point of view but not for the frameworks that already instantiate an OTel SDK of their own. Frameworks already do many other things, like providing defaults, validations and value transformations. All this is already done and would be bypassed by the new OTel config file.

Popular frameworks have their own config systems. That's great - if those frameworks want to make opentelemetry a first class citizen, they can evolve to be able to configure the things opentelemetry users expect (the flat scheme provided by environment variables very quickly runs into problems expressing very real config scenarios). But not all users buy into these frameworks - do we leave these users behind, only providing a suboptimal environment variable schema?

File configuration isn't and doesn't need to be the only configuration option. It doesn't erase the environment variable scheme, or programmatic configuration, which enables all sorts of alternative configuration mechanisms like those provided by frameworks. But we do need a language agnostic way to fully express the desired configuration of an SDK.

Is the request to do something different with file configuration or to not have a file configuration option at all?

Placing all possible configs in an OTel config will require frameworks to scan for services there, for example, if we want them to work in native mode.

This appears to be no different than the problem we face today in opentelemetry-java with the SPIs to load custom exporters. With opentelemetry-sdk-extension-autoconfigure, you can set OTEL_TRACES_EXPORTER=foo, and if there is a ConfigurableSpanExporterProvider implementation on the classpath corresponding to foo, autoconfigure will use it to create the corresponding SpanExporter.

In relation to the env. vars., in all frameworks I can remember, they take precedence over any other kind of configuration. OTel shouldn't be special.

Below I've listed several examples where trying to merge environment variables with file configuration yields unintuitive / unexpected results. For me to take this request seriously, I need to see proposals on how these situations would be resolved:

Simple Priority

We have the same options in env vars and config file, but different values. Which wins? Easy enough - environment variables always win. Although, we'll no doubt have users complaining that what they see in their config file isn't reflected in reality.

Env vars

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

File config

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://some-other-endpoint:4317

Conflict example 1

We have env variable information which can be merged with file configuration such that both are true. In this case, env variables specify to use the OTLP exporter, while file configuration says zipkin. One person might expect to see the OTLP exporter overwrite zipkin. Another might expect to see spans exported to both OTLP and zipkin. Yet another might expect to see only zipkin.

Env vars

OTEL_TRACES_EXPORTER=otlp

File config

tracer_provider:
  processors:
    - batch:
        exporter:
          zipkin:
            endpoint: http://localhost:9411/api/v2/spans

Conflict example 2

In this case we have configuration which can be merged, but doing so will almost certainly yield the wrong result. The environment variables specify the OTLP trace endpoint, assuming the http/protobuf default protocol, which includes the path. The config file specifies grpc protocol. If the environment variable takes priority over file configuration, http/protobuf variant of the endpoint specified via environment variable will be used to configure the OTLP grpc exporter specified in file config, causing an error.

Env vars

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://some-other-endpoint:4318/foo/bar/v1/traces

File config

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://some-endpoint:4317
            protocol: grpc

Conflict example 3

In this case we see how the flat nature of the environment variable scheme falls short when trying to merge with the structure of a config file. The config file specifies that the sdk should export spans to two different endpoints, but an environment variable is specified which sets the OTLP endpoint to something else. If you override both from the config file you're almost certainly not doing what the user would want. And if you change the effective config to result in a single OTLP exporter with the endpoint in the environment variable, you're also almost surely not doing what the user wants. Presumably, they want to override one of the endpoints in the config file, but it's impossible to know that they really want this, or which one in the config file to override.

Env vars

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://yet-another-endpoint:4318

File config

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://some-endpoint:4318
    - batch:
        exporter:
          otlp:
            endpoint: http://some-other-endpoint:4318

I can keep listing examples, but the point is that I don't know of any strategy for overlaying environment variables on top of a configuration file that will make sense to most users (let alone all) in most cases. Instead, we should aim to give simple primitives which can be combined to accommodate simple and complex use cases:

Provide an in memory representation of a config model
Provide an operation to parse a config file to an in memory config model
Provide an operation to create a configured SDK from an in memory config model
Provide a way to perform environment variable substitution in a config file
Provide an environment variable with a value pointing to a config file, which triggers file configuration

Most users will be fine just saying OTEL_CONFIG_FILE=/path/to/sdk-config.yaml, and be content with the simple ability to do environment variable substitution in a config file.

Users with complex requirements can combine these primitives in all sorts of interesting ways:

produce an in memory config model from some alternative source
define merge logic for combining multiple in memory config models
define logic for overlaying the environment variable scheme on top of an in memory config model in a way that makes sense for them
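As an illustration of user-defined merge logic over in memory models, here is one possible policy, sketched with plain dicts. `merge_models` is a hypothetical helper, and the choice that mappings merge key by key while lists and scalars are replaced wholesale is just one of several defensible rules; the model keys are assumptions too.

```python
def merge_models(base, overlay):
    """Return a merged model without mutating base.

    Policy (one choice among many): dicts merge key by key;
    lists and scalars from the overlay replace the base value.
    """
    if isinstance(base, dict) and isinstance(overlay, dict):
        merged = dict(base)
        for key, value in overlay.items():
            merged[key] = merge_models(base[key], value) if key in base else value
        return merged
    return overlay


base = {
    "resource": {"attributes": {"service.name": "shop", "deployment.environment": "test"}},
    "propagators": ["tracecontext"],
}
overlay = {
    "resource": {"attributes": {"deployment.environment": "prod"}},
    "propagators": ["tracecontext", "baggage"],
}

merged = merge_models(base, overlay)
print(merged)
```

A different user could just as reasonably append lists instead of replacing them, which is why this logic is left to the user rather than fixed by the SDK.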

Jan 18 '24 21:01 jack-berg

These are java specific concepts @brunobat. Can you generalize this feedback to the spec level?

Yes, will send a PR to the sdk-configuration file.

The spec actually doesn't define any priority among config methods, it just states "The SDK MUST provide a programmatic interface for all configuration". I agree with the statement. This can be interpreted as the file config being just another SDK configuration method that shouldn't take precedence over the base generic programmatic interface, which in the Java case is represented by the properties supplier.

The OTel file config must use the base generic programmatic interface and shouldn't prevent the use of other existing or future configuration systems of the SDK... Which seems totally reasonable.

Jan 18 '24 23:01 brunobat

Simple Priority

We have the same options in env vars and config file, but different values. Which wins? Easy enough - environment variables always win. Although, we'll no doubt have users complaining that what they see in their config file isn't reflected in reality.

Sure, someone will complain, but overriding configs defined in files with env. vars. is widely accepted and commonplace in the container world. It can be decided either way if properly documented; however, not giving higher priority to env. vars. nowadays seems counterintuitive and against industry practice.

Conflict example 1

In this case the exporter should be OTLP and the endpoint the one specified in the file. Note that the endpoint shouldn't be specific to zipkin, but owned by the exporter.

These overrides happen all the time in a microservice. Files define the base config and env. vars. define the exceptions to particular attributes. I acknowledge that there can be many overrides.

Conflict example 2

Yes, the resulting merged config doesn't make sense, but the error can also happen with the file based config. Validation will be always needed and should be implemented to save people's time.

Conflict example 3

This is an excellent example of things that cannot be properly configured by env. vars. with the current abstraction, which leads me to the main point....

I think we are mixing the definition of a "programmatic interface for all configuration" with the definition of the file based configuration. This makes the programmatic interface effectively useless for other configuration systems because the file config takes precedence and will include things that cannot be done in any other way. The programmatic interface should be independent if we want to properly configure things with files, env. vars. and other config systems, as discussed above.

We should have a generic reusable programmatic interface for the config, a builder pattern for it, and only then the file format configuration for it. Merging and validating the config attributes should also be a task for the builder.

Jan 19 '24 00:01 brunobat

I think we are mixing the definition of a "programmatic interface for all configuration" with the definition of the file based configuration. This makes the programmatic interface effectively useless for other configuration systems because the file config takes precedence and will include things that cannot be done in any other way.

Why does file configuration make the programmatic interface effectively useless?

File configuration is an abstraction that is built on top of the programmatic configuration interface. It will be natural to package file configuration in a separate artifact from the core SDK components it configures. By definition, it cannot be more expressive than the programmatic interface, since ultimately all options need to be translated to programmatic equivalents. The proposal in #3805, in which file configuration takes priority over the environment variable scheme, only applies if the user opts into it by: 1. Including the necessary file configuration artifact. 2. Specifying OTEL_CONFIG_FILE=...

A framework that has its own configuration system can avoid file configuration entirely by not including the artifact, and / or by providing equivalent functionality. I.e. the framework provides its own format and schema for specifying configuration, and provides logic which parses, validates that configuration and uses the programmatic configuration interface to produce an SDK according to the configuration.

We should have a generic reusable programmatic interface for the config, a builder pattern for it, and only then the file format configuration for it.

💯 That's exactly what's happening.

Jan 19 '24 16:01 jack-berg

💯 That's exactly what's happening.

That's excellent.

Why does file configuration make the programmatic interface effectively useless?

From what's written in the spec, nothing. However, when discussing the particular java implementation of the file configuration it was mentioned that some configs would only be available through the file config.

If all configurations will be available through the programmatic interface, which in the Java implementation case is represented by the properties supplier, I'm ok.

Jan 22 '24 09:01 brunobat

If I have 2 different envs, say test and prod, does the file spec allow for me to specify an env var for that and effectively create 2 different configs from the one file, or are there restrictions which would mean for some things I would have to have 2 different files?

Obviously for the existing description I can provide 2 different files and select one in OTEL_CONFIG_FILE, but my use case here is where I hold a single central template file (thinking opamp in the future) and I want to set up config from that one template based on which env the agent runs in.

Jan 22 '24 16:01 jackshirazi

As for the overall question, I've seen this page reading like this for a long time now.

So it would be surprising to me that suddenly OTEL_SERVICE_NAME (to choose just the top one there) no longer works when I - or anyone else in the chain of operators who could be involved in setting up my application+agent - set a file config. Yes, I agree it's opt-in so I should know that using the file means the env var is then ignored. It's still surprising. And we generally prefer the principle of least surprise. I would tend towards merging.

For the conflict objections, I would do the simplest merge, and with an option to output all values, it's straightforward to debug. So yes there would be conflicts, but they would mostly be easily caught in dev/test and resolved.

What we see from our customers is that they like to use different configuration capabilities for different things. File config as a base that they can distribute easily. Environment to customize that. Central config for ease of changing config across many different applications (especially for dynamically adjustable options). The current proposal is to implement the file config so it accepts environment, but not the ones already supported, which means that for the existing deployments they already have configured, the file config would need to explicitly include each of those. That is, somewhere in my file config I'll need to have the service name defined and to use ${OTEL_SERVICE_NAME} - otherwise I either can't use file config or I have a painful adjustment in my systems to define a bunch of new things.

As I write this, I'm thinking maybe a reasonable compromise is to provide a file config with all those variables already defined in the file as a template. Of course that doesn't cover all situations and effectively adds boilerplate, which is another anti-pattern, but it would be something worth doing if we stay with no merging.

Jan 22 '24 17:01 jackshirazi

I think it's going to be continually surprising to users that all of the standard OTEL_* environment variables are ignored as soon as you introduce a yaml configuration file (e.g. to configure metric views).

At least in the Java world, it's super standard (and people expect) for env vars to override configuration files (and not in an all-or-nothing way).

I appreciate that this problem may not have a perfect solution, but I'd like to explore a compromise that could be less surprising to users.

For each env var we could define the minimal effect it has on the config file (as opposed to an all-or-nothing approach).

OTEL_TRACES_EXPORTER is probably the most complex, and so let's see what we could do for it first.

We could say that OTEL_TRACES_EXPORTER=abc means drop other exporters besides abc, while adding abc exporter with default configuration if there is not already at least one abc exporter present. (Another option is to drop all exporters including abc, and add abc exporter with the default configuration, but I think that would be more surprising because it drops all existing configuration of the abc exporter).

OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is simpler, and we could say it overrides every instance in the config file of tracer_provider.processors.batch.exporter.otlp.endpoint.

Applying these rules to @jack-berg's examples:

Conflict example 1

OTEL_TRACES_EXPORTER=otlp

and

tracer_provider:
  processors:
    - batch:
        exporter:
          zipkin:
            endpoint: http://localhost:9411/api/v2/spans

would result in

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:

Conflict example 2

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://some-other-endpoint:4318/foo/bar/v1/traces

and

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://some-endpoint:4317
            protocol: grpc

would result in

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://some-other-endpoint:4318/foo/bar/v1/traces
            protocol: grpc

Note: this is probably an incorrect configuration (using an http/protobuf endpoint and grpc protocol), but I think it's probably the least surprising merge given the user only used env var to override the endpoint and not the protocol.

Conflict example 3

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://yet-another-endpoint:4318

and

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://some-endpoint:4318
    - batch:
        exporter:
          otlp:
            endpoint: http://some-other-endpoint:4318

would result in

tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            endpoint: http://yet-another-endpoint:4318
    - batch:
        exporter:
          otlp:
            endpoint: http://yet-another-endpoint:4318

Note: this is probably an incorrect configuration (having two otlp exporters pointing to the same endpoint), but again I think it's probably the least surprising merge, and therefore probably the best.
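The two rules above can be sketched against an in memory model to check them against the three examples. This is one reading of the proposal, not agreed behavior: `apply_env` and `exporter_name` are hypothetical helpers, the model is assumed to be a nested dict where every processor is a `batch` processor with exactly one exporter, and the "default configuration" of an added exporter is represented as an empty dict.

```python
import copy


def exporter_name(processor):
    # Assumes each processor looks like {"batch": {"exporter": {<name>: <settings>}}}.
    return next(iter(processor["batch"]["exporter"]))


def apply_env(model, environ):
    """Overlay two env vars on a copy of the model, following the rules above."""
    result = copy.deepcopy(model)
    processors = result["tracer_provider"]["processors"]

    # Rule 1: OTEL_TRACES_EXPORTER=abc drops other exporters and adds a
    # default abc exporter only if none is already present.
    wanted = environ.get("OTEL_TRACES_EXPORTER")
    if wanted:
        processors = [p for p in processors if exporter_name(p) == wanted]
        if not processors:
            processors = [{"batch": {"exporter": {wanted: {}}}}]
        result["tracer_provider"]["processors"] = processors

    # Rule 2: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT overrides every otlp endpoint.
    endpoint = environ.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
    if endpoint:
        for processor in processors:
            otlp = processor["batch"]["exporter"].get("otlp")
            if otlp is not None:
                otlp["endpoint"] = endpoint
    return result


def batch(name, **settings):
    return {"batch": {"exporter": {name: settings}}}


example_1 = {"tracer_provider": {"processors": [
    batch("zipkin", endpoint="http://localhost:9411/api/v2/spans")]}}
example_3 = {"tracer_provider": {"processors": [
    batch("otlp", endpoint="http://some-endpoint:4318"),
    batch("otlp", endpoint="http://some-other-endpoint:4318")]}}

print(apply_env(example_1, {"OTEL_TRACES_EXPORTER": "otlp"}))
print(apply_env(example_3, {
    "OTEL_TRACES_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": "http://yet-another-endpoint:4318",
}))
```

Writing the rules down this way also shows how much is left undecided, for example what happens to processors that are not `batch`, or to exporters nested under other signal providers.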

As @jackshirazi says above:

For the conflict objections, I would do the simplest merge, and with an option to output all values, it's straightforward to debug. So yes there would be conflicts, but they would mostly be easily caught in dev/test and resolved.

Jan 22 '24 23:01 trask

It's not perfect but I could get on board with that merge logic. If we were to do this, we should give the user a way to understand what the resolved configuration model looks like. The natural thing to do is to log out the resolved model after applying environment variable overlays, but considering it may have secrets in it, we wouldn't always be able to do that. Instead, we could:

Detect if any environment variables were present and if so, log a warning message indicating the resolved configuration is not what was specified in the configuration file.
Provide details in the log message on how to opt-in to printing the full resolved configuration (akin to otel.javaagent.debug=true) where the user assumes the risk of any secrets that may be printed
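A rough sketch of those two behaviors, assuming the resolved model is a JSON-serializable dict. `report_resolved_config`, its `debug` flag, and the logger name are hypothetical; only variable names, never values, are logged unless the user opts in.

```python
import json
import logging

logger = logging.getLogger("otel.config")  # hypothetical logger name


def report_resolved_config(resolved, environ, debug=False):
    """Warn when OTEL_* variables may have altered the file; dump only on opt-in."""
    overlays = sorted(
        key for key in environ
        if key.startswith("OTEL_") and key != "OTEL_CONFIG_FILE"
    )
    if not overlays:
        return overlays
    logger.warning(
        "Resolved configuration may differ from the configuration file; "
        "environment variables present: %s. Enable debug output to print the "
        "full resolved configuration (it may contain secrets).",
        ", ".join(overlays),
    )
    if debug:
        # The user has accepted the risk of secrets appearing in logs.
        logger.warning("Resolved configuration: %s", json.dumps(resolved, sort_keys=True))
    return overlays


present = report_resolved_config(
    {"tracer_provider": {"processors": []}},
    {"OTEL_CONFIG_FILE": "/path/to/sdk-config.yaml", "OTEL_SERVICE_NAME": "checkout"},
)
print(present)
```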

Jan 22 '24 23:01 jack-berg

There was something else we mentioned in the SIG meeting. Let's think about what happens right now for any implementation that does not have the configuration prototype:

Let's say that these environment variables are defined (I'll use here the same example @jack-berg used during the SIG meeting):

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://yet-another-endpoint:4318

And the user has this code in the application:

SdkTracerProvider.builder()
        .addSpanProcessor(BatchSpanProcessor.builder(
                OtlpGrpcSpanExporter.builder().setEndpoint("http://some-endpoint:4318").build()).build())
        .addSpanProcessor(BatchSpanProcessor.builder(
                OtlpGrpcSpanExporter.builder().setEndpoint("http://some-other-endpoint:4318").build()).build())
        .build();

When that combination of environment variables and application code are executed something happens, let's give this something a name: X.

Now, the code above is equivalent to some configuration file, because when a certain configuration file is used, we also get X.

If we think about this project for a moment, we can also see it not as "configuration of OpenTelemetry" but as "configuration of a script that creates certain OpenTelemetry objects before the application is run". Every configuration file makes this script execute in a certain way, whose results could also be achieved by executing certain code directly in the application.

So, we can see this problem in this way:

What happens when OTel is executed when there are environment variables set and also a configuration file?

And the answer to that question could be:

The same that would happen if the equivalent objects that are created with the configuration file would have been instantiated in the application with the same environment variables set.

I think users could understand this configuration project better if we present it not as configuration of OpenTelemetry in the way that environment variables configure OpenTelemetry, but as a way for them to define which and how certain components are instantiated before the rest of the application runs.

Jan 23 '24 01:01 ocelotl

I gave this issue more thought and now I think that it might not be an issue at all. Instead of trying to find an algorithm to merge, override, or do something else, let's just tell users: this is what the configuration file component would end up instantiating if you run OpenTelemetry with the environment variables that are set right now. To do this, I propose we add to the configuration file component an option (maybe named dry_run or something similar) that instead of running anything would just print the code that would instantiate the same objects that would be instantiated by the configuration file component if it was executed normally. In this way, the user can adjust their environment variables and find out the result that changing them would have on the instantiated objects and we don't have to figure out any algorithm to solve conflicts between the environment variables and configuration files.

I implemented an example that prints part of the code that would correspond to an instantiation of a tracer provider. Running it produces this:

TracerProvider(
    Sampler(
        always_off=None,
        always_on=None,
        jaeger_remote=None,
        parent_based=ParentBasedSampler(
            local_parent_not_sampled=LocalParentNotSampledSampler(
                always_off=None,
                always_on=None,
                jaeger_remote=None,
                parent_based=ParentBasedSampler(
                    local_parent_not_sampled=None,
                    local_parent_sampled=None,
                    remote_parent_not_sampled=RemoteParentNotSampledSampler(
                        always_off=None,
                        always_on=None,
                        jaeger_remote=None,
                        parent_based=None,
                        trace_id_ratio_based=ParentBasedTraceIdRatio(
                            TraceIdRatioBased(
                                0.0001
                            ),
                            StaticSampler(
                                Decision(
                                    2
                                )
                            ),
                            StaticSampler(
                                Decision(
                                    0
                                )
                            ),
                            StaticSampler(
                                Decision(
                                    2
                                )
                            ),
                            StaticSampler(
                                Decision(
                                    0
                                )
                            ),
                        ),
                    ),
                    remote_parent_sampled=None,
                    root=None,
                ),
                trace_id_ratio_based=None,
            ),
            local_parent_sampled=LocalParentSampledSampler(
                always_off=None,
                always_on=StaticSampler(
                    Decision(
                        2
                    )
                ),
                jaeger_remote=None,
                parent_based=None,
                trace_id_ratio_based=None,
            ),
            remote_parent_not_sampled=RemoteParentNotSampledSampler(
                always_off=StaticSampler(
                    Decision(
                        0
                    )
                ),
                always_on=None,
                jaeger_remote=None,
                parent_based=None,
                trace_id_ratio_based=None,
            ),
            remote_parent_sampled=RemoteParentSampledSampler(
                always_off=None,
                always_on=StaticSampler(
                    Decision(
                        2
                    )
                ),
                jaeger_remote=None,
                parent_based=None,
                trace_id_ratio_based=None,
            ),
            root=RootSampler(
                always_off=None,
                always_on=None,
                jaeger_remote=None,
                parent_based=None,
                trace_id_ratio_based=ParentBasedTraceIdRatio(
                    TraceIdRatioBased(
                        0.0001
                    ),
                    StaticSampler(
                        Decision(
                            2
                        )
                    ),
                    StaticSampler(
                        Decision(
                            0
                        )
                    ),
                    StaticSampler(
                        Decision(
                            2
                        )
                    ),
                    StaticSampler(
                        Decision(
                            0
                        )
                    ),
                ),
            ),
        ),
        trace_id_ratio_based=None,
    ),
    Resource(
        reprOrderedDict(
            [
                (
                    "telemetry.sdk.language",
                    "python",
                ),
                (
                    "telemetry.sdk.name",
                    "opentelemetry",
                ),
                (
                    "telemetry.sdk.version",
                    "1.23.0.dev0",
                ),
                (
                    "service.name",
                    "unknown_service",
                ),
            ]
        ),
        schema_url="https://opentelemetry.io/schemas/1.16.0",
    ),
)

Jan 23 '24 06:01 ocelotl

My big concern with having the yaml file overrule all of the configuration:

YAML is not as easy as defining an environment variable.
If you use environment variables as well as the YAML file, you have to duplicate the environment variables in the YAML file. Otherwise, they will be ignored.

Given this, I support having the file configuration automatically include the environment variables. If the same setting is defined both in the file and as an environment variable, the value from the file wins. By the way, in MicroProfile Config, environment variables and system properties are also included as configuration sources by default.
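The rule described above (environment variables are included, but the file wins on conflict) can be sketched as a recursive dictionary merge. This is only an illustration: the function name merge_file_over_env and the idea that environment variables have already been mapped into a nested dictionary are assumptions, not something the spec defines.

```python
def merge_file_over_env(file_config, env_config):
    """Return a new dict: env_config fills the gaps, file_config wins on conflicts."""
    merged = dict(env_config)
    for key, value in file_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides have a nested section: merge it key by key.
            merged[key] = merge_file_over_env(value, merged[key])
        else:
            # The file value replaces whatever the environment provided.
            merged[key] = value
    return merged


# Hypothetical values; the env side is assumed to be pre-parsed into a dict.
env_config = {"exporter": {"endpoint": "http://from-env:4318", "timeout": 10}}
file_config = {"exporter": {"endpoint": "http://from-file:4318"}, "sampler": "always_on"}

merged = merge_file_over_env(file_config, env_config)
print(merged)
```

Here the endpoint comes from the file, while the timeout, which the file does not mention, is still picked up from the environment.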

Jan 23 '24 16:01 Emily-Jiang

I think it's going to be continually surprising to users that all of the standard OTEL_* environment variables are ignored as soon as you introduce a yaml configuration file (e.g. to configure metric views).

@trask I don't know how surprising it would be (after all they are making a decision to use a config), but overall I would strive for less complexity rather than more. The existing situation with env vars is already pretty complex, and devising overlaying rules with config adds even more complexity (the content of your comment is a perfect illustration of that complexity). I don't feel that this complexity is warranted, given that standard variable substitution in the config provides the same customization capabilities, it's a well-understood solution without additional mental overhead.
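For readers unfamiliar with the variable substitution approach, here is a minimal sketch of how it could work. The ${NAME} and ${NAME:-default} syntax, the substitute_env helper, and the YAML keys are assumptions for illustration, not the exact rules of the file configuration spec.

```python
import re

# Matches ${NAME} and ${NAME:-default}; the syntax is an assumption for this sketch.
_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def substitute_env(text, env):
    """Replace ${NAME} references in text with values taken from env."""

    def replace(match):
        name = match.group("name")
        default = match.group("default")
        if name in env:
            return env[name]
        # Fall back to the inline default, or an empty string if none was given.
        return default if default is not None else ""

    return _PATTERN.sub(replace, text)


config_text = """\
tracer_provider:
  exporter:
    endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}
"""

resolved = substitute_env(
    config_text,
    {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://yet-another-endpoint:4318"},
)
print(resolved)
```

With this approach the file is the only place that decides which environment variables matter, so there is no separate overlay step to reason about.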

Jan 23 '24 17:01 yurishkuro

I propose we add to the configuration file component an option (maybe named dry_run or something similar) that instead of running anything would just print the code that would instantiate the same objects that would be instantiated by the configuration file component if it was executed normally. In this way, the user can adjust their environment variables and find out the result that changing them would have on the instantiated objects and we don't have to figure out any algorithm to solve conflicts between the environment variables and configuration files.

@ocelotl The idea of having code that prints code seems brittle and hard to maintain.

I'm also having a little trouble understanding what you are suggesting, so let me restate my understanding:

You're coming from a standpoint where a language, like python, implements the interpretation of the environment variable scheme directly in components rather than in a separate artifact. I.e. the Otlp exporters directly interpret OTEL_EXPORTER_OTLP_* environment variables.
File configuration strongly suggests decoupling interpreting the configuration model from the SDK components, but ultimately, the Create operation still has to instantiate components like the OTLP exporter, which in some languages may be performing additional interpretation of environment variables.
You're saying that we don't need to solve the merge problem because languages that interpret environment variables inside of components already have merge semantics. So instead, just give tools for printing out how these things interact.

I don't think I like this. It essentially leaves the merge semantics to be a language level decision, which reduces portability of configuration.

If I'm misinterpreting the proposal please let me know.

Jan 23 '24 22:01 jack-berg

It essentially leaves the merge semantics to be a language level decision, which reduces portability of configuration.

I think this is a great argument against overlaying! We already have a mess of compatibility matrices with varied support for different env vars. Saying No to overlaying just removes this problem altogether - languages only need to implement one generic variable substitution, not a mish-mash of implementations in each and every component.

Jan 23 '24 22:01 yurishkuro

@ocelotl The idea of having code that prints code seems brittle and hard to maintain.

Maybe it is different for other languages but in Python it's quite simple, we only need to add a well-known method to every class.
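As a rough illustration of what "a well-known method on every class" could look like, here is a sketch that uses __repr__ to print constructor-style code. The ReprMixin helper and the simplified TracerProvider and TraceIdRatioBased classes are stand-ins made up for this example; they are not the real SDK classes or the actual prototype.

```python
class ReprMixin:
    """Builds a constructor-style repr from the arguments stored in _init_args."""

    def __repr__(self):
        args = ", ".join(
            f"{name}={value!r}" for name, value in self._init_args.items()
        )
        return f"{type(self).__name__}({args})"


class TraceIdRatioBased(ReprMixin):
    # Stand-in sampler; it only records its constructor argument.
    def __init__(self, ratio):
        self._init_args = {"ratio": ratio}


class TracerProvider(ReprMixin):
    # Stand-in provider; it only records its constructor argument.
    def __init__(self, sampler):
        self._init_args = {"sampler": sampler}


provider = TracerProvider(sampler=TraceIdRatioBased(ratio=0.0001))
print(repr(provider))
```

Because each repr calls the repr of its arguments, nested components are printed recursively without any central printing logic.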

You're coming from a standpoint where a language, like python, implements the interpretation of the environment variable scheme directly in components rather than in a separate artifact. I.e. the Otlp exporters directly interpret OTEL_EXPORTER_OTLP_* environment variables.

Correct. That's what we currently do in Python. This is an important point, more about this later.

File configuration strongly suggests decoupling interpreting the configuration model from the SDK components, but ultimately, the Create operation still has to instantiate components like the OTLP exporter, which in some languages may be performing additional interpretation of environment variables.

Ok, I think I agree with this...

You're saying that we don't need to solve the merge problem because languages that interpret environment variables inside of components already have merge semantics. So instead, just give tools for printing out how these things interact.

Mostly correct, more about this later too.

I don't think I like this. It essentially leaves the merge semantics to be a language level decision, which reduces portability of configuration.

The merge semantics are already a language decision. If they are currently a problem, we have to fix that problem where it currently is, not by adding the configuration file component. The file configuration component cannot be the solution for this problem because:

Even after the file configuration component is added, users will still be able to run OTel without using it.
The file configuration component trying to merge or override the environment variables would be an additional problem, because there is no right way of doing this kind of operation (or maybe there is? Dynaconf may have an algorithm, more below :thinking:).

@jack-berg I think that one of your goals for the file configuration component is to provide developers with a clean, decoupled mechanism to obtain configuration values (something like configuration = Configuration(); configuration.otel_exporter_timeout == 10). That would be great (in fact I even tried to do the same thing a long time ago). While working on that I realized there already existed a Python project that did the same thing, Dynaconf.

Now, 20 seconds ago, while writing this I looked into this project again and found that they may have an algorithm that we could use (hope this helps!).

If I understand @yurishkuro's comments correctly, I think @yurishkuro and I agree (@yurishkuro please correct me if I am wrong). We should not try to add an arbitrary or poorly defined algorithm to merge the environment variables and configuration files.

So, to summarize a few things:

Regardless of what we decide on this merging/overriding/etc issue, I see value in allowing the user to know what objects are to be instantiated by the configuration file component. This can help users clearly understand what the file configuration component is doing which is essential for debugging. For Python I have implemented this feature by printing the equivalent code, maybe for other languages there are better approaches.
I am not opposed to merging environment variables with configuration values per se. I am opposed to merging environment variables with configuration values using an arbitrary or not-so-well defined approach. If we can't find a clean way of doing this, we are better not doing it at all.
It would be great to have a "configuration" object that we can use in our code that would abstract developers from having to use the "raw" values of environment variables (or maybe configuration files too). From my past experience implementing that I remember it was nice to have something that would automatically transform an environment value string "true" into a boolean value True that we could use in the SDK. Nevertheless, if we want to have this feature work with environment variables and configuration files too, we first need to find a clean, proper algorithm to do the merging/overriding, if not, we are better without this feature as well.
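A minimal sketch of such a "configuration" object, assuming it only reads from a mapping of environment variables: the Configuration class, its coercion rules, and the OTEL_EXPORTER_TIMEOUT name (taken from the inline example earlier in this comment) are illustrative, not an existing API.

```python
class Configuration:
    """Exposes environment variables as typed, lowercase attributes."""

    _BOOLEANS = {"true": True, "false": False}

    def __init__(self, env):
        self._env = dict(env)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        key = name.upper()
        if name.startswith("_") or key not in self._env:
            raise AttributeError(name)
        return self._coerce(self._env[key])

    @classmethod
    def _coerce(cls, raw):
        # Booleans first, then numbers, otherwise keep the raw string.
        lowered = raw.strip().lower()
        if lowered in cls._BOOLEANS:
            return cls._BOOLEANS[lowered]
        for cast in (int, float):
            try:
                return cast(raw)
            except ValueError:
                continue
        return raw


configuration = Configuration(
    {"OTEL_EXPORTER_TIMEOUT": "10", "OTEL_SDK_DISABLED": "true"}
)
print(configuration.otel_exporter_timeout == 10)
print(configuration.otel_sdk_disabled is True)
```

In practice one would pass os.environ as the mapping. Extending this to configuration files is exactly where the merge question comes back, which is why this sketch stops at a single source.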

Jan 23 '24 23:01 ocelotl

Now, 20 seconds ago, while writing this I looked into this project again and found that they may have an algorithm that we could use (hope this helps!).

In Java there is MicroProfile Config. It's a spec with many implementations to manage configurations: https://github.com/eclipse/microprofile-config

It would be great to have a "configuration" object that we can use in our code that would abstract developers from having to use the "raw" values of environment variables (or maybe configuration files too).

I agree.

I think it makes sense to discuss configuration sources. We now have file and env./sys. vars. (there might be more in the future?) and I think it makes sense to decide if we are going to have a hierarchy of sources or if they will be flat and it's up to the implementations to figure out the merge. I think leaving the merge behaviour to the language-specific implementations would lead to inconsistencies and mess up the config of large microservice deployments, with services implemented in many languages.

Jan 24 '24 08:01 brunobat

Do we know of any (other) libraries/frameworks that have standard env vars and a standard configuration file format where the standard env vars don't take precedence (and not in an all-or-nothing way)? I can't think of any in the Java space, but maybe this is common in other ecosystems?

Jan 24 '24 14:01 trask

CC @kittylyst

Jan 24 '24 15:01 brunobat

I think leaving the merge behaviour to the language specific implementations would lead to inconsistencies and mess up the config of large microservice deployments, with services implemented in many languages.

Yes, I agree. Just to be clear, my point is that even without this file configuration component, this problem can happen (and probably happens) right now, something like this:

# An environment variable is set beforehand
OTEL_EXPORTER_ENDPOINT_URL="http://some.url"

...

# Here an OTelExporter is instantiated, endpoint_url is optional
# and its default value is an empty string.
exporter = OTelExporter(endpoint_url="http://some.other.url")

Which value will the exporter have for endpoint_url?

Again, I agree, it's a bad thing to leave the final value of endpoint_url up to each language.
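To make the ambiguity concrete, here is a sketch of one possible answer, where an explicit argument beats the environment variable, which in turn beats the default. OTelExporter is the hypothetical class from the example above, the precedence rule is just an assumption, and None is used instead of an empty string as the "not passed" marker so the two cases can be told apart.

```python
class OTelExporter:
    """Hypothetical exporter; not a real OpenTelemetry class."""

    ENV_VAR = "OTEL_EXPORTER_ENDPOINT_URL"  # name taken from the example above

    def __init__(self, endpoint_url=None, env=None):
        env = {} if env is None else env
        if endpoint_url is not None:
            # Assumed rule: an explicit argument beats the environment.
            self.endpoint_url, self.source = endpoint_url, "argument"
        elif self.ENV_VAR in env:
            self.endpoint_url, self.source = env[self.ENV_VAR], "environment"
        else:
            # Default value from the example above: an empty string.
            self.endpoint_url, self.source = "", "default"


env = {"OTEL_EXPORTER_ENDPOINT_URL": "http://some.url"}

explicit = OTelExporter(endpoint_url="http://some.other.url", env=env)
implicit = OTelExporter(env=env)
print(explicit.endpoint_url, explicit.source)
print(implicit.endpoint_url, implicit.source)
```

A different language could just as reasonably let the environment variable win, which is the inconsistency being discussed.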

Jan 24 '24 16:01 ocelotl

The merge semantics are already a language decision. If they are currently a problem, we have to fix that problem where it currently is, not by adding the configuration file component.

I don't think we actually have to solve that.

Consider if we agree to state that file configuration should ignore the existing environment variable scheme: In this case, implementors of the Create method would have to invoke the programmatic APIs of components in a way that ensures that environment variables are ignored. If not, the implementation would not be compliant.

Now consider we take the opposite stance, and state that we want to merge the environment variable scheme with file configuration: In this case, we definitely want consistency across languages in terms of how the merge semantics work. It's unlikely that the existing implementations are consistent, so we'd probably have to:

Define the merge semantics in the spec.
Implementations with existing merge semantics in conflict with the spec would need to find a way to ignore those existing mechanisms in favor of the new rules. If not, the implementation would not be compliant.

Regardless of what we decide on this merging/overriding/etc issue, I see value in allowing the user to know what objects are to be instantiated by the configuration file component. This can help users clearly understand what the file configuration component is doing which is essential for debugging. For Python I have implemented this feature by printing the equivalent code, maybe for other languages there are better approaches.

Yes, this is important. In Java, all of our SDK components implement public String toString() and print out their configuration using a format which is idiomatic in Java. We already use this today to allow the user to understand their effective resolved SDK after various customization layers have muddied things up. For file configuration, we should have the added ability to take a resolved configuration model and print it back out to YAML. Ideally, this directly describes an SDK. But in practice, there may be small discrepancies between a model and an SDK (e.g. consider a model with an exporter property the exporter doesn't know about, or that the file configuration Create method doesn't yet know how to interpret).

I am opposed to merging environment variables with configuration values using an arbitrary or not-so-well defined approach. If we can't find a clean way of doing this, we are better not doing it at all.

The Dynaconf example you give is interesting, but it essentially just describes what the equivalent environment variable is to target a nested property in a configuration model. That would help us if we wanted to introduce an entirely new environment variable scheme with names derived from the configuration model schema. (This is worth considering.) But I can't see what it tells us about merging the existing environment variable scheme with file configuration.

It would be great to have a "configuration" object that we can use in our code that would abstract developers from having to use the "raw" values of environment variables (or maybe configuration files too). From my past experience implementing that I remember it was nice to have something that would automatically transform an environment value string "true" into a boolean value True that we could use in the SDK. Nevertheless, if we want to have this feature work with environment variables and configuration files too, we first need to find a clean, proper algorithm to do the merging/overriding, if not, we are better without this feature as well.

That's what we're trying to achieve with the configuration model. In #3840 I propose that we introduce a new dedicated operation for updating a configuration model with environment overloads. The effect would be that a user could call Create(configurationModel) to configure from a model ignoring environment variables, or call Create(MergeEnvironment(configurationModel)) to overlay the environment variable scheme on top of the model before calling create. The design philosophy mirrors that of unix, with small focussed programs (in this case functions) which can be combined to accommodate a wide variety of requirements.
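The composition could be sketched roughly as follows. The functions create and merge_environment are simplified stand-ins for the Create and MergeEnvironment operations, the ENV_TO_PATH table is a hypothetical mapping, and letting the environment win on conflict is an assumption; the actual semantics are whatever #3840 ends up defining.

```python
import copy

# Hypothetical mapping from existing env vars to paths in the model.
ENV_TO_PATH = {
    "OTEL_EXPORTER_OTLP_ENDPOINT": ("tracer_provider", "exporter", "endpoint"),
    "OTEL_SERVICE_NAME": ("resource", "service.name"),
}


def merge_environment(model, env):
    """Return a copy of model with env-derived values overlaid (env wins here)."""
    merged = copy.deepcopy(model)
    for name, path in ENV_TO_PATH.items():
        if name not in env:
            continue
        node = merged
        for segment in path[:-1]:
            node = node.setdefault(segment, {})
        node[path[-1]] = env[name]
    return merged


def create(model):
    """Stand-in for Create: reports the endpoint an exporter would be built with."""
    return model.get("tracer_provider", {}).get("exporter", {}).get("endpoint")


model = {"tracer_provider": {"exporter": {"endpoint": "http://some-endpoint:4318"}}}
env = {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://yet-another-endpoint:4318"}

print(create(model))                          # environment ignored
print(create(merge_environment(model, env)))  # environment overlaid first
```

Because merge_environment returns a new model rather than mutating its input, the caller chooses per call site whether environment variables participate.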

I want to draw attention to this comment from @trask:

Do we know of any (other) libraries/frameworks that have standard env vars and a standard configuration file format where the standard env vars don't take precedence (and not in an all-or-nothing way)?

If there aren't examples of this, then that's a strong signal to us, since we shouldn't design something out of step with industry expectations / norms.

Supposing that we can't think of enough examples to make a strong case, our options for supporting environment variable overrides include:

1. Define semantics for merging the existing environment variable scheme into a configuration model. @trask proposed one option based on principle of least surprise (which I've drafted into PR #3840), but there are other semantics possible. I think we can all agree that if we take this path there will be cases where users are surprised by the merge semantics, but this may be ok.
2. Invent a new environment variable scheme specifically for overriding file configuration properties. Use a well-defined set of rules for deriving the environment variable name which selects any particular property. State that the old environment schema is ignored when file configuration is used. In the long term, the user experience will be less surprising, since there are clean, consistent rules for environment variable overrides. But in the short term, there will be pain because of the hard cutover required to migrate to file configuration.
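To give a feel for the second option, here is a sketch of deriving an environment variable name from a property's path in the model and using it as an override. The OTEL_CONFIG prefix, the double-underscore separator, and both function names are invented for this example; no such scheme has been specified.

```python
def env_var_name(path, prefix="OTEL_CONFIG"):
    """Derive a variable name from a property path (hypothetical naming rule)."""
    return "__".join([prefix] + [segment.upper() for segment in path])


def apply_overrides(model, env, path=()):
    """Return a copy of model where any leaf with a matching env var is replaced."""
    result = {}
    for key, value in model.items():
        child_path = path + (key,)
        if isinstance(value, dict):
            result[key] = apply_overrides(value, env, child_path)
        else:
            # Overrides arrive as strings; type conversion is left out of this sketch.
            result[key] = env.get(env_var_name(child_path), value)
    return result


model = {"tracer_provider": {"exporter": {"endpoint": "http://some-endpoint:4318"}}}
env = {
    "OTEL_CONFIG__TRACER_PROVIDER__EXPORTER__ENDPOINT": "http://override:4318",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://ignored:4318",  # old scheme, not consulted
}

overridden = apply_overrides(model, env)
print(env_var_name(("tracer_provider", "exporter", "endpoint")))
print(overridden)
```

Even this small sketch leaves open questions such as how to address list elements or keys that contain the separator.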

Jan 24 '24 21:01 jack-berg

Invent a new environment variable scheme specifically for overriding file configuration properties. Use a well-defined set of rules for deriving the environment variable name which selects any particular property. State that the old environment schema is ignored when file configuration is used. In the long term, the user experience will be less surprising, since there are clean, consistent rules for environment variable overrides. But in the short term, there will be pain because of the hard cutover required to migrate to file configuration.

I would strongly suggest following option 2, as it makes it clear what the expectations of specific variables are when interoperating with a configuration file. I would suggest we state that using existing env variables with config creates an unspecified state, since currently implementations are somewhat doing different things to implement support for these.

I think that merging the existing env var schema on top of the configuration makes it difficult for end users to remember (if I use this variable such thing happens, otherwise this other thing) and makes it tricky for implementations as well.

Inventing a new set of variables isn't ideal, especially since so many of the variables that exist today are marked stable and we'd likely have to support them until 2.0, but at least it's easier to understand what the repercussions of using these new variables would be in this context.

Jan 24 '24 21:01 codeboten

I would strongly suggest following option 2, as it makes it clear what the expectations of specific variables are when interoperating with a configuration file. I would suggest we state that using existing env variables with config creates an unspecified state, since currently implementations are somewhat doing different things to implement support for these.

Nice! Let's dynamically generate the environment variables from the configuration file :sunglasses:

Jan 24 '24 21:01 ocelotl

Food for thought @codeboten, @ocelotl and anyone else interested in the proposal to introduce a new environment variable scheme where keys are generated from the schema:

https://docs.google.com/document/d/1yPfdf6fsxWY7onWU_PLmIIPs14H6pTYbyI7OboiODCw/edit

The solution is not without problems...

Jan 25 '24 23:01 jack-berg

I think you're one step away from making a Turing-complete environment variable scheme

Jan 26 '24 10:01 jackshirazi

The Google Docs document is interesting and proposes a set of important rules.

I don't mind having a file centric naming schema but do we really need to change the name of most env vars defined in the current environment variables?

I would prefer to have started with the definition of a "configuration" object based on the current environment variables to ensure backwards compatibility, where possible.

I also believe a "configuration" object works better as the foundation for environment variables, file configs and other possible configuration sources, like frameworks that already have their own configuration schema and need to integrate with the OTel SDK.

The "configuration" object should accept attributes from different sources and perform the merge there. We are probably missing a component here and trying to assign too much scope to the file config.

Yes, we should merge environment variables, but not implemented in the file config.

Maybe the OTel spec shouldn't define a wide set of attribute names; instead, the naming rules, merging strategy, sources and priority between sources should be its focus.

Jan 26 '24 11:01 brunobat
