kpt icon indicating copy to clipboard operation
kpt copied to clipboard

Fixing updates in porch

Open mortent opened this issue 2 years ago • 7 comments

We are currently discussing two different alternatives for doing updates in porch. It is the current merge behavior from kpt (which is roughly equivalent to how we can do a merge commit in git) and the reclone-and-replay approach (which roughly corresponds to rebase in git). We are still discussing which option is the best long term, with a possible solution being that we support both.

Unfortunately our current porch API and implementation of update is a mix of both. Users specify the new upstream through the CloneTask, even though we actually do merge updates. This leads to very strange task lists, in particular since the updateMutation adds CloneTasks to the history. This is obviously confusing for further updates (which clone task should be used?) and for replaying tasks for creating new revisions (since the actual tasks used isn't accurately represented in the task list).

Luckily I think our API with tasks provide a simple and intuitive way to handle both ways of updating a package:

For doing a merge update (the only type of update we are currently supporting), users should add an UpdateTask to the task list and pass the PackageRevision to the Update endpoint of the API. The UpdateTask will include a reference to the new Upstream and potentially the merge strategy that should be used. The UpdateTask will map directly to the updateMutation, that will generate a commit with the changes that corresponds with the UpdateTask. The updated package will have a task list that accurately describes the mutations of the package, and it is possible to replay the tasks to end up at the same.

For doing a reclone-and-replay, the correct way to specify this through the API is to update the original CloneTask. This will then cause the update to happen by recloning the package with the new version and then replay the tasks (possibly multiple times for the iterative version of replay updates). This will not lead to any UpdateTask in the task history of the revision, and this makes sense since there haven't actually been a merge, just recloning and replaying the update. Again, the resulting PackageRevision can be recreated by replaying the tasks.

Since we are still exploring the replay updates, our first goal should be to fix the API to allow for the merge updates. This means creating and handling UpdateTasks. This will give us updates equivalent to kpt in porch. Then we can continue to explore reclone-and-replay, and we can add support for this without impacting the merge updates.

mortent avatar Jun 25 '22 23:06 mortent

This makes sense to me. @droot and I were chatting and we discussed the idea of surfacing another field, likely the merge strategy, and exploring both options that way. But I do agree that we can better represent what's actually happening by doing what you suggest: allowing updates to CloneTask trigger reclone-and-replay, adding an UpdateTask triggers the merge.

The advantage of your proposal here over an explicit field is that if/when we feel confident that one way is better than the other, at that point we simply disallow either editing the CloneTask or adding the UpdateTask, we aren't left with a vestigial updateStrategy field that can only have one value.

justinsb avatar Jun 27 '22 14:06 justinsb

Sounds very reasonable, so +1.

Few things come to mind:

  • Would be good if we can detect and reject update requests if client end up mixing up these two modes. I think a simple check would be reject updating the clone Task if updateTask exists in the task list.
  • Currently updateMutation retrieves the original version of the package from the old cloneTask, Now the logic might be to look at the last updateTask (0 or many may exists, if no updateTask exists, then probably picking one from last cloneTask)

droot avatar Jun 27 '22 17:06 droot

Yeah, so doing a merge update after doing replays should be fine. With the current task model and replay, there isn't any difference between a package created from v1 and then updated to v2, and a package created initially from v2. I agree that trying to replay a package that has been updated in the past doesn't seem to make much sense, although I'm not 100% sure there can't be valid use-cases.

The logic for handling the tasks in the package revision update function will need to change. Essentially there should be two options:

  • A new UpdateTask has been appended to the end: This means we should do a merge update and the new upstream will be specified in the UpdateTask itself.
  • The upstream reference in the clone task has been updated: This means a replay and shouldn't involve the updateMutation at all. This should not lead to any additional commits or tasks other than the ones that are there already.

mortent avatar Jun 27 '22 18:06 mortent

Yeah, looks good to me.

droot avatar Jun 27 '22 20:06 droot

It sounds like the replay approach to updates wouldn't be supported by the kpt CLI?

bgrant0607 avatar Jun 30 '22 15:06 bgrant0607

It sounds like the replay approach to updates wouldn't be supported by the kpt CLI?

I don't see any reason why we couldn't support it in the CLI. kpt alpha rpkg update could have an additional flag to determine which update to do.

natasha41575 avatar Jun 30 '22 15:06 natasha41575

Some thoughts on update behaviors:

  • Blueprints should effectively provide default values. That means downstreams should be able to override them. Changes to blueprint values are then like changing defaults, which arguably is a non-backwards-compatible change, but should not clobber overrides regardless.
  • Blueprints also provide recipes (aka factories) for creating variants. However, by definition, variant-constructor functions need to be able to take their variant-specific inputs from somewhere other than the blueprint, namely from the downstream packages.
  • Admins should be able to specify validating and mutating admission control functions that apply to all downstream packages created from upstream packages and cannot be overridden or removed. I'll be fuzzy for now about whether that's at the repo level or package level. #3279 and #3218 cover this.
  • Downstream consumers should be able to add their own admission control functions, but the mandated functions specified upstream should have the last word.
  • The resources output by a generator need to run through full admission control.
  • It should be possible to re-run upgraded functions, such as for bulk function upgrades #3309

Some concrete update scenarios I've run into:

  • I created a blueprint and forgot to add one of the variant-constructor functions. I realized it only when I created a deployment from it. (Testing workflows are a whole other issue.) I wanted to add it to the blueprint, then update the deployment.
  • I added new resources or attributes to the blueprint that I wanted deployments to inherit.
  • We updated set-namespace and I wanted to update my blueprints and all deployments to the new version.
  • Someone may want to update an image tag or digest and promote it across environments via update. #3280
  • Multi-cluster rollouts would be similar: change a multi-cluster blueprint, then do a bulk upgrade to cluster-specific variants and gradually sync to the new revisions.

In all of those cases, I'd expect the blueprint changes should not clobber downstream customizations.

bgrant0607 avatar Jun 30 '22 16:06 bgrant0607

Closing this issue as the API design suggested in this issue was implemented in https://github.com/GoogleContainerTools/kpt/pull/3338 and https://github.com/GoogleContainerTools/kpt/pull/3336

mortent avatar Mar 27 '23 22:03 mortent