argo-cd icon indicating copy to clipboard operation
argo-cd copied to clipboard

feat: ApplicationSet Go template

Open speedfl opened this issue 3 years ago • 25 comments

Closes #9961

Introduction

Purpose of this Pull request is to improve ApplicationSet templating capabilities by using Go text/template.

I am not a Go expert (Java Developer) so don't hesitate if you have remarks on best practices.

Security concerns

billion-laughs attack protection is working: applicationset/generators/cluster_test.go

Non backward compatibility concern

In GitGenerator the following changes occured:

  • path is now a structure, therefore:
    • path representing the full path is now path.path (name could be dicuss)
    • path[n] representing the path segments is now path.segments[n] (name could be discuss)

Note on DCO:

If the DCO action in the integration test fails, one or more of your commits are not signed off. Please click on the Details link next to the DCO action for instructions on how to resolve this.

Checklist:

  • [x] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • [x] The title of the PR states what changed and the related issues number (used for the release note).
  • [x] I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • [x] I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • [x] Does this PR require documentation updates?
  • [x] I've updated documentation as required by this PR.
  • [ ] Optional. My organization is added to USERS.md.
  • [x] I have signed off all my commits as required by DCO
  • [x] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • [x] My build is green (troubleshooting builds).

speedfl avatar Jul 18 '22 14:07 speedfl

Codecov Report

Merging #10026 (de72194) into master (3f858f5) will increase coverage by 0.18%. The diff coverage is 81.04%.

@@            Coverage Diff             @@
##           master   #10026      +/-   ##
==========================================
+ Coverage   45.93%   46.11%   +0.18%     
==========================================
  Files         227      227              
  Lines       27419    27550     +131     
==========================================
+ Hits        12594    12705     +111     
- Misses      13114    13132      +18     
- Partials     1711     1713       +2     
Impacted Files Coverage Δ
...is/applicationset/v1alpha1/applicationset_types.go 34.69% <ø> (ø)
applicationset/generators/matrix.go 70.73% <57.14%> (-2.25%) :arrow_down:
applicationset/generators/merge.go 59.04% <57.14%> (-1.96%) :arrow_down:
applicationset/utils/map.go 60.86% <58.33%> (-2.77%) :arrow_down:
applicationset/utils/utils.go 80.50% <75.00%> (+2.09%) :arrow_up:
applicationset/generators/list.go 62.50% <76.47%> (+3.87%) :arrow_up:
applicationset/generators/git.go 85.96% <95.55%> (+2.80%) :arrow_up:
...cationset/controllers/applicationset_controller.go 56.75% <100.00%> (ø)
applicationset/generators/cluster.go 77.90% <100.00%> (+3.83%) :arrow_up:
applicationset/generators/duck_type.go 68.62% <100.00%> (+1.61%) :arrow_up:
... and 5 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov[bot] avatar Jul 18 '22 15:07 codecov[bot]

@crenshaw-dev I will try this

speedfl avatar Jul 18 '22 16:07 speedfl

@crenshaw-dev thanks a lot for your suggestion.

Here are the outcome of a discussion with a colleague:

path object contains:

  • path: the full path (previous path variable)
  • segments: for the list of segments (ex path[n] variables)
  • basename: the previous path.basename.
  • filename: the previous path.filename
  • basenameNormalized: the previous path.basenameNormalized
  • filenameNormalized: the previous path.filenameNormalized

For path function we unfortunately have to make a choice:

  • returning the full path (the previous path parameter)
  • returning the new object path, making all other parameters working (path.basename, path.filename, path.basenameNormalized, path.filenameNormalized)

Making this choice will break the backward compatibility as some field won't work.

Concerning the path[n], it is unfortunately not possible to have a function name path[n] (example path[0]). We will then have to break the backward compatibility.

I would suggest to add this feature as part of an ApplicationSet v1alpha2.

speedfl avatar Jul 18 '22 18:07 speedfl

I think my earlier suggestion was over-complicated.

What if we detected old-style templates and simply upgrade them? This regex should work for path segments:

{{\s*path\[(\d+)\]\s*}}

You can insert the initial ., add .segments, and preserve the index with a matching group.

This regex should work for path.*:

{{\s*path\.([a-zA-Z]+)\s*}}

Just add a . at the beginning and preserve the member name with the matching group.

The replacements can be done on tmplBytes in RenderTemplateParams before the template string is parsed.

The docs can simply state that old-style params will continue to work as long as they're the only part of the template. If someone wants to properly use Go templating, they have to switch to the new syntax.

crenshaw-dev avatar Jul 18 '22 20:07 crenshaw-dev

The git files generator introduces some more complicated cases. I forget how flexible the old flatten logic was, but I believe a regex-based upgrade is still possible.

{{\s*(([a-zA-Z\d]+?(\.[a-zA-Z\d]+))+)\s*}}

(Matches {{ some.dynamic.value }}, making it replaceable with {{ .some.dynamic.value }}.)

crenshaw-dev avatar Jul 18 '22 20:07 crenshaw-dev

@crenshaw-dev really good idea. Let me try it

speedfl avatar Jul 19 '22 12:07 speedfl

If we replace only complex values (ie: some.value) we will miss all variables with a simple word.

Example:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook
spec:
  generators:
  - list:
      elements:
      - cluster: engineering-dev
        url: https://1.2.3.4
  template:
    metadata:
      name: '{{cluster}}-guestbook'

However, if we try to detect all variables, whatever if there is separation or not (ie above cluster) we will have to whitelist all go template functions. Here is a small list (non exhaustive):

https://github.com/golang/go/blob/9ce28b518d9a792d2e3e741bcb38fa046891906e/src/text/template/funcs.go#L42-L62 https://github.com/golang/go/blob/9ce28b518d9a792d2e3e741bcb38fa046891906e/src/text/template/parse/lex.go#L80-L90

Other considerations:

  • this won't solve the path => .path.path and path[n] => path.segments[n] for which we will have to add a custom logic.

Looking deeper I think this may add lot of corner cases, which can result in lot of regressions for basic templating and reduce functionalities for go template (this will introduce more maintenance)

I would propose 3 solutions (my order of preference top to bottom)

Solution 1

Create a v1alpha2 which use gotemplate.

Pro:

  • Backward compatible
  • Rollout deprecation
  • Easy removal (just duplicate temporarly the code and remove the full folder of v1alpha1 when ready)

Cons:

  • Need to introduce new controller + crds
  • Double maintenance ? It is possible to explain that maintenance will be done only on v1alpha2

Solution 2

Add a field (or an annotation) to activate go templating. This will allow us to have a rollout phase:

  • Phase 1: fasttemplate is still the default but deprecated. Field activate go templating
  • Phase 2: fasttemplate removed. Field to activate go templating not needed anymore

Pro:

  • Backward compatible
  • Rollout deprecation
  • Easy removal (just duplicate temporarly the code and remove the full folder leveraging fasttemplate)

Cons:

  • What happen to this field in the future ?
  • Double maintenance ? It is possible to explain that maintenance will be done only gotemplate or shared code

Solution 3

Try go template. In case of failure backup to fast template. This would have a cost in term of performance. This will imply flattening + handling the path case

Pro:

  • Backward compatible
  • Transparent to user

Cons:

  • Performance issues for big workload
  • Complex maintenance
  • Ugly ?

Solution 4

Detect and hope we don't forget anything

Regexp for complex structure: {{\s*((?!\s*if.*|\s*range.*|\s*define.*|.*-.*|\s*\$|\.).*?\s*(?!-)?)\s*}} (we need to add all protected keywords): https://regex101.com/r/ghzt0b/3

Regexp for path: {{\s*(path(\[\d+\])?)(\.(.*))?\s*}}: https://regex101.com/r/R0ofRv/1

speedfl avatar Jul 19 '22 14:07 speedfl

I still think option 4 is viable. The functions shouldn't be a problem, because as far as I can tell, they all require a space and then at least one argument. They won't match the example regexes. The keyword cases are easily solved by looping over the params and replacing single-word templates only for the params present. It would be up to the user to resolve collisions if they have a param name that overrides a Go template keyword they want to use. But point taken, there are probably a lot of corner cases.

I think of the remaining options, I prefer option 2. It could be done by introducing a goTemplate field alongside the template field.

crenshaw-dev avatar Jul 19 '22 15:07 crenshaw-dev

Last regexp and corner cases: https://regex101.com/r/ghzt0b/4

{{ hello.hello-world }} Should match and Must be transformed into {{ index .hello "hello-world" }}.

We can also do the following logic

if s not matches `{{\s*-?(?:if|range|define|index|with|".*?"|\$|.*?\(.*?\)|\.).*?-?}}` # Search go template matchs
  s = transformInGoTemplate(s)
fi
templateWithFastTemplate(s)

Here is the logic to track Go Template ( https://regex101.com/r/lT8k6I/3) we need to add all protected gotemplate keywords.

To tranform in go template it just consist in changing (sequentially):

  • all occurence of path[n] to .path.segments[n] with {{\s*path\[(.*)\]\s*}} to {{ .path.segments[${1}] }}
  • all occurence of path to .path.path with {{\s*path\s*}} to {{ .path.path }}
  • all occurence of anything.with.or.without.dot to .anything.with.or.without.dot with {{\s*(?!\.).*?\s*}} to {{ .${1} }}

I maintain it will bring too much complexity and risk. Solution 1 and 2 are:

  • Safer
  • Better in term of performance (regexp can be really costly on big strings)
  • More maintainable (regexp can be really complex to maintain to cover all cases)

Concerning solution 2,based on the choice it will imply crd updates.

Keep me posted

speedfl avatar Jul 19 '22 16:07 speedfl

@crenshaw-dev Here is what I can propose with legacy handling and transformation (with unit tests)

https://github.com/speedfl/argo-cd/commit/29c42bde998cae26e4a26d5e279dc9ee346f8790

If you are ok I can update the documentation + add some integration tests

speedfl avatar Jul 20 '22 13:07 speedfl

Extremely well researched, thanks!

I agree with your hesitations about the regex implementation. Option 2 seems very reasonable to me.

Concerning solution 2,based on the choice it will imply crd updates.

That's true. Across Argoproj, we've generally been okay with adding fields without versioning the CRD, so long as we don't subtract fields.

Adding a goTemplate field would leave us stuck maintaining the old template field in v1alpha1. But I think that's a manageable cost.

crenshaw-dev avatar Jul 20 '22 13:07 crenshaw-dev

@crenshaw-dev I did all the adaptation.

  • I ensured backward compatibility by putting back all unit tests & e2e tests with fasttemplate. There is a bit of test duplication but it was on purpose. When fasttemplate will be decommissioned we will just have to delete the "non go template" tests without doing chirurgical changes.
  • I updated the documentation by adding a GoTemplate section
  • I created all the examples in GoTemplate

speedfl avatar Jul 21 '22 13:07 speedfl

This is looking awesome!

Do we have tests for handling quotation marks in parameter values? I know the fasttemplate implementation had logic to handle escaping quotes (and other JSON control characters) before interpolating output back into the JSON blob. Is that being handled with the go template implementation?

crenshaw-dev avatar Jul 21 '22 14:07 crenshaw-dev

@crenshaw-dev yes I keep the logic for fast template: https://github.com/argoproj/argo-cd/blob/master/applicationset/utils/utils.go#L76-L87

I just removed the allowUnresolved as looking at the code non are using allowUnresolved=false. In addition I factorized the code between what we have in RenderTemplateParams and in Replace

https://github.com/argoproj/argo-cd/blob/master/applicationset/generators/generator_spec_processor.go#L109-L110 https://github.com/argoproj/argo-cd/blob/master/applicationset/generators/cluster.go#L175-L176 https://github.com/argoproj/argo-cd/blob/master/applicationset/generators/cluster.go#L175-L176 https://github.com/argoproj/argo-cd/blob/master/applicationset/utils/utils.go#L40-L41

Concerning the GoTemplate, special characters are already handled.

If you are outside {{}} you are good. However if you want to use it inside you need by convention use ` character.

template.Must(template.New("").Parse("{{`{{value}}`}}"))

speedfl avatar Jul 22 '22 14:07 speedfl

@speedfl I'm more concerned about what happens when values contain quotes, since we're templating over a JSON blob. Consider this case:

template.Must(template.New("").Parse(`{"key": "{{.value}}"}`))

If .value == ", then JSON unmarshaling will fail.

The fasttemplate implementation resolved that by doing strconv.Quote(value)[1:-1] to JSON-escape values before inserting them into the template. I'm not sure if go template provides a hook to transform the evaluated value immediately before interpolation.

crenshaw-dev avatar Jul 22 '22 15:07 crenshaw-dev

@crenshaw-dev you are right. It is blocking. Well seen. I added a unit test and working on the fix :)

speedfl avatar Jul 22 '22 17:07 speedfl

@crenshaw-dev I was able to fix this issue. I added a beautiful unit test in utils_test.go :D

{
	name:        "Index",
	fieldVal:    `{{ index .admin "admin-ca" }}, \\ "Hello world", {{ index .admin "admin-jks" }}`,
	expectedVal: `value "admin" ca with \, \\ "Hello world", value admin jks`,
	params: map[string]interface{}{
		"admin": map[string]interface{}{
			"admin-ca":  `value "admin" ca with \`,
			"admin-jks": "value admin jks",
		},
	},
},

It tests that:

  • " between {{}} are not escaped as they are used for templating functions
  • " outside {{}} are escaped (to not break unmarshall)
  • " in params are escaped correctly (to not break unmarshall)

speedfl avatar Jul 25 '22 14:07 speedfl

@speedfl I'm not sure pre-processing the params will be sufficient. The output of the go expression can still be un-escaped. For example:

		{
			name:        "quote",
			fieldVal:    `{{ slice .quote 1 }}`,
			expectedVal: ``,
			params: map[string]interface{}{
				"admin": `"`,
			},
		},

That test produces this:

        	Error:      	Received unexpected error:
        	            	unexpected end of JSON input

crenshaw-dev avatar Jul 25 '22 19:07 crenshaw-dev

I have a question. What is the motivation of doing this json marshall before and not doing on all fields of the Application instead ? Going through all fields would be more secure I think

speedfl avatar Jul 25 '22 19:07 speedfl

Going through all fields would be more secure I think

Honestly this would very much be my preference. Just recurse over the template and run substitution for each string field.

I think it might be a bit slower. But so far that hasn't really been a problem.

crenshaw-dev avatar Jul 25 '22 19:07 crenshaw-dev

@crenshaw-dev I updated the code with recursive updates. It simplified a lot the code in fast template as well by suppressing the need of quoting and slicing

speedfl avatar Jul 25 '22 20:07 speedfl

That's awesome! Mind running make lint-local to clear that check (I'm thinking the other integration test failure is just a flaky test).

crenshaw-dev avatar Jul 26 '22 13:07 crenshaw-dev

@crenshaw-dev all good !

speedfl avatar Jul 27 '22 11:07 speedfl

@speedfl apologies for the slow review. Taking a look today. This would be awesome to have in 2.5. :-)

crenshaw-dev avatar Aug 01 '22 14:08 crenshaw-dev

@speedfl I made a few of the changes I requested and put them here: https://github.com/argoproj/argo-cd/pull/10163

For some reason it wouldn't let me open a PR directly against your branch.

crenshaw-dev avatar Aug 01 '22 19:08 crenshaw-dev

@crenshaw-dev I added you as collaborator on my fork. You can now submit a PR

speedfl avatar Aug 02 '22 11:08 speedfl

@crenshaw-dev I see that CI failed but it looks like a temporary issue. Could you please relaunch it ?

speedfl avatar Aug 03 '22 12:08 speedfl

@crenshaw-dev I solved merge conflicts

speedfl avatar Aug 04 '22 14:08 speedfl

@crenshaw-dev it seems that there is a problem in integration tests. Could you please check ?

speedfl avatar Aug 04 '22 18:08 speedfl

@speedfl they've gotten flakier recently, and I haven't had time to figure out why. Re-triggering.

crenshaw-dev avatar Aug 04 '22 18:08 crenshaw-dev