spinnaker icon indicating copy to clipboard operation
spinnaker copied to clipboard

Proposal - Support Overrides for Task Definition artifacts with SpEL expressions

Open paragbhingre opened this issue 4 years ago • 3 comments

Issue Summary:

This issue is a proposal to support specifying overrides for task definition artifacts in Amazon ECS

Cloud Provider(s):

ECS

Description:

Overview

This issue is a proposal to support specifying overrides for task definition artifacts in Amazon ECS. The existing functionality of task definition artifacts will continue to work as is. In addition to existing functionality, user will be able to dynamically change the parameter values in their task definition artifacts.

Background

Currently, when using task definition from external sources (e.g. GitHub) in the form of artifacts, it comes in static JSON format. This requires users to create, update and store unique task definition artifacts per server group, and restricts users from reusing one artifact with differing parameters. There is a workaround to change secrets and environment variables dynamically using AWS Systems Manager Parameter Store and AWS Secrets Manager Secrets which indeed solves few use cases. But this workaround doesn’t work for all the parameters in task definition.

Issues caused due to inability to dynamically change parameter values in task def JSON when using artifacts

  • #4622
  • #6113

Requirements

Provide customers with a way to specify per-server group override values for task definition properties when using an artifact. This will allow parameter values inside task definition artifact to be updated dynamically before submitting a task to AWS.

Current state

Currently Orca processes pipeline (SpEL) expressions during pipeline deployments to ECS.

image (1)

Every orchestrate request goes through API gateway (i.e. Gate) to ORCA where pipeline goes thorough SpEL processing. Also, resolved task definition artifact will be prepared before sending the request to create server group in clouddriver. Clouddriver then uses resolved artifact details to download the artifact and create your server group. In short, task definition artifact never passes through SpEL processing (one way of dynamically populating the values) in the entire life cycle of the pipeline deployment.

The downside of this implementation is that it requires a task definition artifact per server group and users can’t change values of the containers dynamically.

New Behavior (Proposed)

New implementation will support SpEL processing on task definition artifact as well. This will allow users to embed pipeline expressions in their task definitions.

This implementation will give you an opportunity to choose if your task definition artifact should go through SpEL processing or not using evaluateTaskDefinitionArtifactExpressions flag.

Untitled (1) (1)

Through new implementation, ECS deploy stage will send pipeline json to API Gateway (i.e. Gate) that will call ORCA to orchestrate your pipeline and download the artifact for SpEL processing based on the evaluateTaskDefinitionArtifactExpressions flag value.

The flag behavior is as followed:

  • [evaluateTaskDefinitionArtifactExpressions] = true, SpEL expressions are evaluated on artifacts
  • [evaluateTaskDefinitionArtifactExpressions] = false (default), SpEL expression evaluation will be skipped for artifacts

Pros of having evaluateTaskDefinitionArtifactExpressions flag is that we will reduce the overhead of processing task definition artifact every time even when users are not using SpEL expressions in their artifacts. Also, using this flag makes sure that we are being additive in terms of features and not changing the default workflow at all.

If you choose to skip the SpEL processing for the task definition artifact then you will go through previous workflow that is current implementation. Otherwise, your SpEL processed pipeline and task definition artifact will be sent to Clouddriver for create server group action.

Pros

  • Easy to adapt - This implementation is consistent with community implementations for override per containers or dynamically changing the server group values. Also, Spinnaker ECS already uses SpEL processing on pipeline JSON, so it would be easy for task def artifacts also to adapt SpEL.
  • Rich feature support - SpEL processing is powerful and supports querying and inserting values at runtime.
  • Would address https://github.com/spinnaker/spinnaker/issues/6113

Cons

  • Overhead - New users are required to learn SpEL before start using this feature and add it to their task definitions to use them.

User experience of the new workflow

  • Configure your task definition artifact to have pipeline expressions as per your need. for example

Screen Shot 2021-03-03 at 9 55 55 AM

  • In pipeline configuration stage add parameters and their values. In this case the parameter is awslogs-group-deploy and value would be spinnaker-demo.

Screen Shot 2021-03-03 at 9 57 33 AM

  • After pipeline expressions evaluation in ORCA task definition artifact would look like:

Screen Shot 2021-03-03 at 9 58 39 AM

Alternative option considered

Another way to allow support for per container override would be to have a new field(s) on stage JSON that can carry the values of the specified parameters.

Option 2 (1)

This solution would allow users to specify fields that they want to override in task definition artifact and their new values.

Potential UX of workflow

  • Here you can choose a value of a parameter that should get replaced at run-time

Screen Shot 2021-03-09 at 12 51 03 PM

Pros

  • Straightforward to use in most cases - users are just required to add fields and their values that they want to set dynamically for the given container name. No overhead of knowing new expression language i.e., SpEL.

Cons

  • Complicated to implement - It will require new complicated data structure(s) to accommodate this feature and perhaps not even work for certain values in the task def, which is a deeply nested structure.
  • Non-standard for Spinnaker - Would be a method of applying overrides that is specific to the ECS provider, vs. in-line with other providers in Spinnaker

Recommendation

We should go with the first option (pipeline expression support) because:

  • It satisfies many scenarios, like log group, and environment variables, which community members want to use dynamic values for in task definitions.
  • SpEL comes with a lot of facilities in terms of helper properties and functions and those can be used to fetch a value of a parameter from multiple contexts such as expressions, triggers and parameters etc.
  • Implementation of an alternative option (i.e. using field mappings) would be specific to ECS and does not align with the other providers in Spinnaker.
  • Also, SpEL evaluation is less complicated to implement as compared to mapping overrides to specific fields, which would not practically support overrides to specific, deeply nested fields or members of collections (e.g., linuxParameters.capabilities.add[1]...)

Open Questions

  1. Is there any specific scenario that community members are implementing or do want to implement in the future but that the recommended option won’t address?
  2. Is there any overhead to explicitly opt-in for SpEL evaluation of task def artifact by setting evaluateTaskDefinitionArtifactExpressions flag to true? Does anyone prefer SpEL evaluation to happen always?

paragbhingre avatar Mar 16 '21 23:03 paragbhingre

CC @away168, @iniinikoski, @rrijkse

paragbhingre avatar Mar 17 '21 16:03 paragbhingre

I think I would prefer Option 1 too, since it is most Spinnaker native. Although Option 2 could be easier for users and useful if you have a way to parse a multilevel key (e.g. parent.child[0].item), and would follow a Bake (manifest) - Helm values override approach.

I think defaulting the flag to evaluateTaskDefinitionAritifactExpressions to false also makes sense - since it is easier for the beginner and simple use case to use.

away168 avatar Mar 17 '21 21:03 away168

I think I would prefer Option 1 too, since it is most Spinnaker native. Although Option 2 could be easier for users and useful if you have a way to parse a multilevel key (e.g. parent.child[0].item), and would follow a Bake (manifest) - Helm values override approach.

I think defaulting the flag to evaluateTaskDefinitionAritifactExpressions to false also makes sense - since it is easier for the beginner and simple use case to use.

Agree with @away168 here - a vote for Option 1. Thank you for building the proposal @paragbhingre !

iniinikoski avatar Mar 30 '21 13:03 iniinikoski