powertools-lambda-typescript icon indicating copy to clipboard operation
powertools-lambda-typescript copied to clipboard

All: Undefined environment variable returning empty string from EnvironmentVariablesService

Open alan-churley opened this issue 4 years ago • 6 comments

Description of the feature request

Problem statement Currently the EnvironmentVariablesService returns an empty string if an env var is not set, rather than undefined.

Ticket to discuss the pros and cons of this approach and consider changing the logic to use undefined

Summary of the feature

Code examples

Benefits for you and the wider AWS community

Describe alternatives you've considered

Additional context

Related issues, RFCs

This issue should be addressed after: #484

alan-churley avatar Aug 10 '21 09:08 alan-churley

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jan 09 '22 14:01 stale[bot]

Would like to pick up again the discussion about this topic to see if we want to change the current behaviour or not. In case we decide to change I'm available to pick up the task.

The outcome of this discussion should be an agreement on whether EnvironmentVariablesService.get() should return an empty string (i.e. '') or undefined when a given environment variable is not set.

Context

The example below is from packages/logger/src/config/EnvironmentVariablesService.ts, but a similar implementation is found in Tracer and Metrics:

/**
 * Class EnvironmentVariablesService
 *
 * This class is used to return environment variables that are available in the runtime of
 * the current Lambda invocation.
 * These variables can be a mix of runtime environment variables set by AWS and
 * variables that can be set by the developer additionally.
 *
 * @class
 * @extends {ConfigService}
 * @see https://docs.aws.amazon.com/lambda/latest/dg/configuration-envvars.html#configuration-envvars-runtime
 * @see https://awslabs.github.io/aws-lambda-powertools-typescript/latest/#environment-variables
 */
class EnvironmentVariablesService extends ConfigService {

  // Reserved environment variables
  private xRayTraceIdVariable = '_X_AMZN_TRACE_ID';

  /**
   * It returns the value of an environment variable that has given name.
   *
   * @param {string} name
   * @returns {string}
   */
  public get(name: string): string {
    return process.env[name]?.trim() || '';
  }

  /**
   * It returns the value of the AWS_REGION environment variable.
   *
   * @returns {string}
   */
  public getAwsRegion(): string {
    return this.get(this.awsRegionVariable);
  }

  /**
   * It returns the value of the POWERTOOLS_SERVICE_NAME environment variable.
   *
   * @returns {string}
   */
  public getServiceName(): string {
    return this.get(this.serviceNameVariable); // <- this.serviceNameVariable is inherited from `ConfigService`
  }
}

EnvironmentVariablesService is used throughout the utilities to influence the behaviour of the instantiated utility.

Logger

For example (link):

/**
   * It adds important data to the Logger instance that will affect the content of all logs.
   *
   * @param {string} serviceName
   * @param {Environment} environment
   * @param {LogAttributes} persistentLogAttributes
   * @private
   * @returns {void}
   */
  private setPowertoolLogData(
    serviceName?: string,
    environment?: Environment,
    persistentLogAttributes: LogAttributes = {},
  ): void {
    this.addToPowertoolLogData(
      {
        awsRegion: this.getEnvVarsService().getAwsRegion(),
        environment:
          environment ||
          this.getCustomConfigService()?.getCurrentEnvironment() ||
          this.getEnvVarsService().getCurrentEnvironment(),
        sampleRateValue: this.getSampleRateValue(),
        serviceName:
          serviceName || this.getCustomConfigService()?.getServiceName() || this.getEnvVarsService().getServiceName() || Logger.defaultServiceName,
        xRayTraceId: this.getEnvVarsService().getXrayTraceId(),
      },
      persistentLogAttributes,
    );
  }

In the example above some keys might end up having an empty string as value, these keys are then filtered out before emitting the log (here) with the following logic:

pickBy(attributes, (value) => value !== undefined && value !== '' && value !== null);

Even if EnvironmentVariablesService.get() returned undefined instead of empty string this logic would stay as it is since it still must be able to filter out values that might come from elsewhere (i.e. user-provided).

Metrics

In Metrics this service is used less heavily and is used only to set service and namespace in packages/metrics/src/Metrics.ts:

private setNamespace(namespace: string | undefined): void {
    this.namespace = (namespace ||
      this.getCustomConfigService()?.getNamespace() ||
      this.getEnvVarsService().getNamespace()) as string;
  }

private setService(service: string | undefined): void {
    const targetService = (service ||
      this.getCustomConfigService()?.getService() ||
      this.getEnvVarsService().getService()) as string;
    if (targetService.length > 0) {
      this.addDimension('service', targetService);
    }
  }

In the case of setService() the if statement would change but remain similar in essence.

In the case of setNamespace() there would be no change except the type casting and the usage of this.namespace elsewhere would stay the same (below current implementation):

return {
      _aws: {
        Timestamp: new Date().getTime(),
        CloudWatchMetrics: [
          {
            Namespace: this.namespace || DEFAULT_NAMESPACE,
            Dimensions: [dimensionNames],
            Metrics: metricDefinitions,
          },
        ],
      },
      ...this.defaultDimensions,
      ...this.dimensions,
      ...metricValues,
      ...this.metadata,
    };

If this.namespace was undefined instead of empty string the this.namespace || DEFAULT_NAMESPACE would be evaluated the same way (DEFAULT_NAMESPACE is picked).

Tracer

In Tracer the EnvironmentVariablesService is used exclusively upon initialisation of the class to set configs and its values are always evaluated after more explicit configs (i.e. passed in constructor or custom config service). For example (here):

/**
   * Setter for `captureResponse` based on configuration passed and environment variables.
   * Used internally during initialization.
   */
  private setCaptureResponse(): void {
    const customConfigValue = this.getCustomConfigService()?.getTracingCaptureResponse();
    if (customConfigValue !== undefined && customConfigValue.toLowerCase() === 'false') {
      this.captureResponse = false;

      return;
    }

    const envVarsValue = this.getEnvVarsService()?.getTracingCaptureResponse();
    if (envVarsValue.toLowerCase() === 'false') {
      this.captureResponse = false;

      return;
    }
  }

If EnvironmentVariablesService returned undefined instead of empty string then the second if statement in the example above would have to include an additional envVarsValue !== undefined but other than that it would remain the same in essence.

To Sum Up

process.env.SOMETHING would return undefined if the SOMETHING environment variable is not set. In our implementation we are instead returning an empty string.

Do we want to leave the implementation as it is or change it and return the actual value? What would be the advantage in doing so?

At the moment I can't think of any good reason to not leave it as it is but I'd like to hear other opinions.

dreamorosi avatar Jan 25 '22 13:01 dreamorosi

Thanks for the thorough context.

Thoughts:

  1. I don't have a strong opinion on undefined vs. ''. To me, undefined is the correct choice. The environment variable may intentionally be ''.
  2. I'm quite concerned on having 3 similar implementations with the same name in 3 packages instead of only one in common. If they are really different and should be in different packages, shouldn't they have their own names?

image

The concern stems from duplications and slightly different implementations. As these classes deviate over time (and we have more utilities), it can introduces bugs. undefined and '' is a good example. Contributors who worked on another utility may use the class with incorrect assumption.

ijemmy avatar Jan 25 '22 14:01 ijemmy

Assigning priority:medium to this issue as we should tackle #484 first. Note that this issue is still open for triage.

saragerion avatar Jan 25 '22 17:01 saragerion

I don't see the EnvironmentVariablesService listed anywhere in the docs. Is this an internal API?

My preference would be to return undefined so you can tell when the variable hasn't been set versus when it was set to an empty string.

We could add a second optional parameter defaultValue so it becomes:

public get(name: string, defaultValue?: string): string {
  return process.env[name]?.trim() ?? defaultValue;
}

buggy avatar Aug 17 '22 13:08 buggy

Hi @buggy, yes it's an internal API and as such is not represented in the docs.

Here's an example of this class being used in the Tracer utility, the other two have a similar implementation.

As it was pointed out in one of the comments above, in addition to maybe fixing the return type, a prospect resolver should also tackle the deduplication work.

We have a 4th package called @aws-lambda-powertools/commons that already exposes an Utility class that is implemented by all other existing utilities. Ideally this environment logic would belong into that utility or in the commons package, rather than being repeated across utilities.

Contributions are welcome. If you or any future reader is interested, please leave a comment and we'll be happy to provide more details / guidance if needed, or at least review the PRs.

dreamorosi avatar Aug 17 '22 14:08 dreamorosi

Update on the topic:

  • The basic getter for environment variables has already been moved into a shared class now in packages/commons/src/config/EnvironmentVariablesService
  • #1141 introduces another member to check for truthy values (i.e. y, 1, yes, on, true are all resolved to true and likewise, n, 0, no, off, false are all false).

The resolver of this issue will have to standardize the getter in the shared EnvironmentVariablesService (first point) and make sure that the function mentioned in the second point is used where it makes sense.

The changes related to this issue should leave public APIs and default values intact.

As usual, if anyone is interested in picking this up, please leave a comment here, then discuss your proposal before opening a PR.

dreamorosi avatar Nov 09 '22 15:11 dreamorosi

This is an internal API that is now part of the commons package. Currently Logger, Metrics, and Tracer consume this API and based on the returned value decide whether to use it or apply defaults/carry on with the logic.

The current implementation of the EnvironmentVariablesService already handles the undefined which for the sake of the downstream consumers is returned as an empty string. If we were to return an undefined each consumer would have to implement their logic to deal with both values.

Given that the logic is no longer duplicated and that this is ultimately an internal API, I'm going to close this issue as resolved for the time being. If in the future we need to extend the behavior of this class we can revisit the choice and consider any refactor.

dreamorosi avatar Feb 27 '23 14:02 dreamorosi

⚠️ COMMENT VISIBILITY WARNING ⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

github-actions[bot] avatar Feb 27 '23 14:02 github-actions[bot]