etcd icon indicating copy to clipboard operation
etcd copied to clipboard

Add server level feature gate

Open siyuanfoundation opened this issue 1 year ago • 17 comments

What would you like to be added?

Add the Kubernetes style feature gate framework into etcd to gate future feature enhancement. Users would be able to turn features on or off for the single server and query feature enablement in a consistent way.

This issue is for tracking KEP-4578. Tasks in this KEP include

  • Milestone 1
  • [x] server feature gate interface defined, and code copied from K8s into pkg/: https://github.com/etcd-io/etcd/pull/18062.
  • [x] --feature-gates flag added: https://github.com/etcd-io/etcd/pull/18279, https://github.com/etcd-io/etcd/pull/18365.
  • [x] feature gate added to the server code, and used by a server level experimental feature: https://github.com/etcd-io/etcd/pull/18359
  • [x] feature metrics added. https://github.com/etcd-io/etcd/issues/19324
  • Milestone 2
  • [x] e2e tests added for the feature gate equivalent of the selected experimental feature.
  • [x] robustness test scenarios added for the selected experimental feature(s).
  • [x] documentation to track all feature gates added to [etcd-io/website]: https://github.com/etcd-io/website/issues/940.
  • Milestone 3
  • [x] migrate all existing --experimental server level features to feature gates without removing the old flags. List of features to migrate:
    • [x] experimental-stop-grpc-service-on-defrag: https://github.com/etcd-io/etcd/pull/18359
    • [x] experimental-enable-distributed-tracing: https://github.com/etcd-io/etcd/issues/19050
    • [x] experimental-initial-corrupt-check: https://github.com/etcd-io/etcd/pull/18478
    • [x] experimental-compact-hash-check-enabled: https://github.com/etcd-io/etcd/pull/19053
    • [x] experimental-enable-lease-checkpoint and experimental-enable-lease-checkpoint-persist: https://github.com/etcd-io/etcd/pull/19356
    • [x] experimental-memory-mlock: https://github.com/etcd-io/etcd/issues/19061
    • [x] experimental-txn-mode-write-with-shared-buffer: https://github.com/etcd-io/etcd/issues/19063

Why is this needed?

To make it easier to add and use new features in etcd. Please refer to https://github.com/kubernetes/enhancements/issues/4578 for more details.

siyuanfoundation avatar May 17 '24 17:05 siyuanfoundation

while https://github.com/kubernetes/enhancements/pull/4610 is still in review, I think there is no much contention regarding server level feature gate. we can get started on the implementation. /cc @stackbaek @henrybear327

siyuanfoundation avatar May 17 '24 17:05 siyuanfoundation

Thanks @siyuanfoundation for the heads up! Happy to contribute to this feature :) Looking forward to it!

henrybear327 avatar May 17 '24 21:05 henrybear327

Can we wait for approvals for the KEP so the implementation done by contributors don't need to be updated when the design details are changed frustrating the contributors?

serathius avatar May 21 '24 19:05 serathius

/cc @YoyinZyc

siyuanfoundation avatar Jul 31 '24 20:07 siyuanfoundation

Several remaining issues/tasks:

  • The flag --experimental-set-member-localaddr is missed. It's added in 3.6 only, so we can just change it to feature gate. assigned to @henrybear327
    • [x] Action: A PR is needed. Contribution is welcome. Thanks. https://github.com/etcd-io/etcd/pull/19413
    • [ ] Action: Please also double check if we miss any other experimental flags
  • All the deprecated fields are not correctly marked. The comment should begins with Deprecated:, refer to https://go.dev/wiki/Deprecated. The benefit is that IDEs will automatically draw a line through the field, so with better awareness to the users. assigned to @ivanvc
    • [x] Action: A PR is needed. Contribution is welcome. Thanks. Done in https://github.com/etcd-io/etcd/pull/19401
  • The deprecated comment/description isn't consistent across "help/flag description/field comment" etc. For example, for --experimental-txn-mode-write-with-shared-buffer, we only added "deprecated ..." in the help.go, but not in the field comment, nor the flag description. Fixed by @ivanvc in https://github.com/etcd-io/etcd/pull/19408
    • [x] Action: A PR is needed. Contribution is welcome. Thanks
  • For embedded use cases, if an experimental flag was migrated to a non-experimental flag, users aren't able to set the experimental field any more, It won't work. For example https://github.com/etcd-io/etcd/blob/6485d48f25f59458ae4a8fde3b00c50774a714c7/tests/integration/tracing_test.go#L59 Instead users have to set the related non-experimental field. It's a breaking change, we need to clearly document this.
    • [ ] Action: documentation
  • For embedded use cases, if an experimental flag was migrated to the feature gate, users aren't able to set the experimental field any more, It won't work. It's also a breaking change. We need to document this, also we need to add document to users how to set it.
    • [ ] Action: documentation

cc @fuweid @jmhbnz @siyuanfoundation @henrybear327 @ivanvc @serathius

ahrtr avatar Feb 12 '25 13:02 ahrtr

I can take on some tasks while the e2e PRs are being reviewed / merged! Please assign it directly to me and I will submit the PR before midnight!

Thanks @ahrtr

henrybear327 avatar Feb 12 '25 13:02 henrybear327

@henrybear327 thx, can you deliver a PR for The flag --experimental-set-member-localaddr is missed firstly?

ahrtr avatar Feb 12 '25 13:02 ahrtr

@henrybear327 thx, can you deliver a PR for The flag --experimental-set-member-localaddr is missed firstly?

Duly noted.

henrybear327 avatar Feb 12 '25 13:02 henrybear327

I'll take the second one:

All the deprecated fields are not correctly marked.

ivanvc avatar Feb 12 '25 15:02 ivanvc

The deprecated comment/description isn't consistent across "help/flag description/field comment" etc. For example, for --experimental-txn-mode-write-with-shared-buffer, we only added "deprecated ..." in the help.go, but not in the field comment, nor the flag description.

I did another pass at deprecations and fixed the following:

  • ExperimentalTxnModeWriteWithSharedBuffer
  • ExperimentalStopGRPCServiceOnDefrag
  • ExperimentalInitialCorruptCheck
  • Standardized all deprecation messages in server/etcdmain/help.go

See PR: https://github.com/etcd-io/etcd/pull/19408

ivanvc avatar Feb 13 '25 00:02 ivanvc

@henrybear327 thx, can you deliver a PR for The flag --experimental-set-member-localaddr is missed firstly?

Duly noted.

@henrybear327 are you able to deliver a PR to fix this today (asap)? thx.

ahrtr avatar Feb 13 '25 07:02 ahrtr

@henrybear327 thx, can you deliver a PR for The flag --experimental-set-member-localaddr is missed firstly?

Duly noted.

@henrybear327 are you able to deliver a PR to fix this today (asap)? thx.

Yes, I am on it. Work is currently ongoing as I ran into some issues that I am debugging now!

henrybear327 avatar Feb 13 '25 08:02 henrybear327

@henrybear327 the migration of the missing experimental flag --experimental-set-member-localaddr has higher priority than the downgrade e2e test (https://github.com/etcd-io/etcd/pull/19399) to me.

ahrtr avatar Feb 13 '25 10:02 ahrtr

@henrybear327 the migration of the missing experimental flag --experimental-set-member-localaddr has higher priority than the downgrade e2e test (#19399) to me.

Sorry for the delay, initial attempt is put up.

henrybear327 avatar Feb 13 '25 12:02 henrybear327

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar May 29 '25 00:05 github-actions[bot]

I believe we can now close this issue, as all tasks have been completed. Unless I'm missing something?

ivanvc avatar May 29 '25 04:05 ivanvc

I think so. @siyuanfoundation could you please confirm if we could close this one now?

wenjiaswe avatar Jun 12 '25 13:06 wenjiaswe

Closing. I believe there are no remaining action items. Please feel free to re-open otherwise.

ivanvc avatar Jul 18 '25 21:07 ivanvc