cli-utils icon indicating copy to clipboard operation
cli-utils copied to clipboard

Tech Debt: Rename prune/delete timeout to reconcile timeout

Open karlkfi opened this issue 4 years ago • 11 comments

As a new user to this code base (through kpt & Config Sync), I was initially really confused about the difference between destroy, prune, and delete. And as a contributor, I have a hard time documenting/explaining the waiting behavior after deletion.

Solution:

  • Rename prune to "delete". This will make the applier & destroyer more consistent and remove ambiguity between the two.
  • Rename prune/delete timeout to "elimination timeout". This will parallel nicely with "reconciliation timeout", I think.

Both of these changes would be interface changes, for options and events, but I think the improved clarity will be worth it.

Destroy is also another synonym for delete, but in this context it means to delete a whole inventory, so we don't need to change that. I think it's easy to describe how destroy is different from delete/prune/elimination.

/cc @seans3 @mortent

karlkfi avatar Oct 18 '21 19:10 karlkfi

An alternate option here is to remove DeleteTimeout/PruneTimeout completely and replace them with the existing ReconcileTimeout.

@haiyanmeng @ash2k Would this be acceptable for your use cases or do you need the timeouts to remain separate?

karlkfi avatar Oct 26 '21 18:10 karlkfi

@karlkfi Appreciate the ping.

Conceptually it may make sense for each logically separate "thing"/step to have a separate timeout. I don't know if it's hard or easy to achieve this, haven't checked the code. In practice though my program passes a context to the applier run and cancels it when it needs to stop the applier (when program is terminating or when another set of objects need to be synced instead of the one that is currently being applied).

An alternate option here is to remove DeleteTimeout/PruneTimeout completely and replace them with the existing ReconcileTimeout.

Are you suggesting ReconcileTimeout to cover the whole process or each phase/step would use a separate timeout set to ReconcileTimeout? I think ideally we would have both, but again I don't know if it's difficult/worth the effort.

ash2k avatar Oct 26 '21 23:10 ash2k

We already allow a global applier timeout using context.WithTimeout (which it sounds like you're using already).

The proposal is either: A) Rename DeleteTimeout & PruneTimeout to EliminationTimeout. This would keep EliminationTimeout distinct from ReconcileTimeout. B) Replace DeleteTimeout & PruneTimeout with ReconcileTimeout. This would combine the WaitTask timeouts to all be the same name.

DeleteTimeout & PruneTimeout both come from the WaitTask, one from the destroyer and one from the applier, so the differentiation is already confusing.

Even if we do either of these, there's still a difference between the DeleteEvent (destroyer) and PruneEvent (applier), even though they both come from the PruneTask. I would like to make them both a DeleteEvent for consistency, but I wasn't sure I wanted to make both API changes at the same time.

karlkfi avatar Oct 26 '21 23:10 karlkfi

EliminationTimeout does not sound quite right to me. I would rather we keep DeleteEvent/DeleteTimeout and get rid of PruneEvent/PruneTimeout.

BTW, I don't think this is very high priority.

haiyanmeng avatar Oct 27 '21 00:10 haiyanmeng

Agreed on it being low priority. It was just surfaced as tech debt from one of my previous PRs.

The problem with the current naming is:

  1. Inconsistency (Delete vs Prune, when both perform the same DELETE call to the cluster)
  2. Inaccuracy (The timeout isn't for the DELETE call to the cluster but actually waiting until the poller performing GET calls verifies that the object is NotFound).

If I were setting a DeleteTimeout, I would expect it to include the DELETE call, but it does not. It only includes the waiting afterwards. This has come up as being confusing several times when trying to explain how the applier actually works.

Perhaps you would prefer WaitTimeout for both applying and deleting?

karlkfi avatar Oct 27 '21 00:10 karlkfi

Ok, so new proposal: How about we just use "Reconcile[d]" for both ideas. This is what I've done for the new WaitEvent operations in https://github.com/kubernetes-sigs/cli-utils/pull/463

So this would mean combining the reconcile/prune/delete timeout configs into one ReconcileTimeout, and treating Apply -> Current and Delete -> NotFound as effectively the same process.

karlkfi avatar Nov 04 '21 19:11 karlkfi

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 02 '22 23:03 k8s-triage-robot

/remove-lifecycle stale

karlkfi avatar Mar 25 '22 22:03 karlkfi

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jun 23 '22 22:06 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jul 23 '22 23:07 k8s-triage-robot

/remove-lifecycle rotten /lifecycle frozen

karlkfi avatar Jul 24 '22 01:07 karlkfi