kpt icon indicating copy to clipboard operation
kpt copied to clipboard

Design and implementation of "built-in" functions

Open frankfarzan opened this issue 4 years ago • 5 comments

Currently, there is no notion of built-in functions. The first-party function images in gcr.io/kpt-fn have no special meaning or treatment and are released and versioned independently from the kpt CLI. This is by design and has many benefits:

  • We want kpt CLI to be a small, stable, core that provides extension points. In contrast, there will be an ever growing catalog of functions with a higher release cadence.
  • We don't want the complexity of having separate mechanisms for "core" vs "custom" functions (Lessons learned from k8s api machinery).

One disadvantage of this approach (See https://github.com/GoogleContainerTools/kpt/issues/1838#issuecomment-831907230) is that the user is exposed to, and needs to think about, version management for even basic functions. To address this, we can provide a porcelain on top of the existing functionality without creating fundamentally different mechanisms under the hood.

Strawman:

In addition to supporting fully specified function images e.g.:

gcr.io/kpt-fn/set-labels:v1

We can support a notation for functions in the gcr.io/kpt-fn registry, e.g.:

set-labels

This will resolve to a container image with a fully specified semver, e.g.:

gcr.io/kpt-fn/set-labels:v1.2.3

This resolution is done by lookin up manifest of functions baked into the CLI, e.g.:

"set-labels": "v1.2.3"
"apply-setters": "v2.3.4"

This manifest is is cut with the release of the CLI and contains the latest stable version of each function in kpt-fn registry at the time.

Caveats:

  • The version of the "built-in" function using this notation is tied to the CLI version. Updating to a new version of the CLI will in general be backwards incompatible if the major version of a function has changed. If this is not desired, then the user should provide the version explicitly.
  • This doesn't prevent mixing explicit (e.g. gcr.io/kpt-fn/set-labels:v1.2.3) and built-in (e.g. set-labels) in the same package hierarchy. If the user wants to prevent the former (at the risk of incompatibility with off-the-shelf packages), there needs to be an orthogonal mechanism. See #1904

frankfarzan avatar May 05 '21 18:05 frankfarzan

Related idea: Multiple functions (e.g., set namespace, name, labels, annotations) could be bundled into the same container, though I'm sure it would affect the UX some.

bgrant0607 avatar Jun 25 '21 20:06 bgrant0607

In addition to setters, common kustomize-style transforms (names, namespaces, labels, annotations), and common validation functions, the Starlark runtime is a prime candidate as a built-in function also.

bgrant0607 avatar Sep 09 '21 23:09 bgrant0607

To address this, we can provide a porcelain on top of the existing functionality without creating fundamentally different mechanisms under the hood.

I actually think that having a special in-process mechanism for running Starlark functions would be advantageous.

I know it used to be in-process and was moved to a container like all the other functions, but the current strategy of having all functions be in a container has some negative consequences:

  • makes kpt slow (docker pull, docker run)
  • requires docker (which now costs money for large business to use on Mac & Windows)
  • makes function authoring more costly than it needs to be (build, version, publish, etc).

If we make Starlark functions a special type of function that run in-process it would have the following benefits:

  • Faster end to end execution for kpt fn render and kpt fn eval
  • Pull Starlark KRM Functions from your own K8s cluster!
  • Share Starlark KRM Functions with other tools, like webhooks (Policy Controller) and controllers (Config Sync)
  • Pull Starlark KRM Functions from a URL (like github, or a local nginx), instead of just a container registry. Pulling from the internet might be a security risk, but from a local nginx with a directory full of functions would make it trivial for an org to make their own function library managed by GitOps.
  • "built-in" functions could just be a local directory full of Starlark KRM or a container of Starlark KRM, independently updateble from kpt itself. OR we could just bundle them into kpt itself as files in the binary, with something like https://github.com/go-bindata/go-bindata. That way the published functions and built-in functions can use the same exact source code.

We could even integrate with Cloud Code so that syntax highlighting and IDE features work when writing Starlark in KRM.

In this world, functions would be pretty trivial to write, could be used both client-side and server-side, and would even be usable for validating and mutating webhooks. You could use the same validation functions in kpt, Config Sync, and Policy Controller. This would really unify the Anthos Config Management story around a single ecosystem of functions for both config and policy.

karlkfi avatar Sep 10 '21 00:09 karlkfi

I think after being in the market for a year and getting feedback and comparison to existing tools like kustomize I would advise the team to build the basic functions in. Here are some thoughts for why building them in makes sense:

  1. Docker licensing has changed
  2. Docker is much slower than in-binary execution.
  3. The churn in basic transformers has probably slowed down a bunch.

Here are my candidates for built in (this is a subset of kustomize):

  1. Annotations
  2. Labels
  3. Namespace
  4. Prefix
  5. Suffix
  6. Imagetag
  7. ReplicaCount

I think these are generally useful and the behavior is well understood.

mikebz avatar Aug 26 '22 18:08 mikebz

@mikebz Yes, it's a good idea to have some common functions built into Kpt and totally agree with those 3 reasons you listed. I saw you've called it out several times and here's more detailed reasons why it's not planned:

  1. We are more focusing on the WYSIWYG, which prioritizes the Porch UI experience over kpt CLI. On the porch UI side, we already have some functions builtin (starlark, set-namespace and replacement) and get the benefit.
  2. Most of the functions you listed are not quite mature yet, especially prefix/suffix. Other functions needs to add more features (e.g. https://github.com/GoogleContainerTools/kpt/issues/3444) and need better first-class identifier transformer support.
  3. I prioritize the function and SDK work according to our WYSIWYG UI critical paths. And the function execution time hasn't been a big blocker yet. While maintaining a up-to-date builtin function list is a big overhead. And we've spend many effort on it in May. It seems not to be a good timing to add the support in Kpt CLI.

yuwenma avatar Aug 26 '22 20:08 yuwenma