Add server level feature gate
What would you like to be added?
Add the Kubernetes style feature gate framework into etcd to gate future feature enhancement. Users would be able to turn features on or off for the single server and query feature enablement in a consistent way.
This issue is for tracking KEP-4578. Tasks in this KEP include
- Milestone 1
- [x] server feature gate interface defined, and code copied from K8s into
pkg/: https://github.com/etcd-io/etcd/pull/18062. - [x]
--feature-gatesflag added: https://github.com/etcd-io/etcd/pull/18279, https://github.com/etcd-io/etcd/pull/18365. - [x] feature gate added to the server code, and used by a server level experimental feature: https://github.com/etcd-io/etcd/pull/18359
- [x] feature metrics added. https://github.com/etcd-io/etcd/issues/19324
- Milestone 2
- [x] e2e tests added for the feature gate equivalent of the selected experimental feature.
- [x] robustness test scenarios added for the selected experimental feature(s).
- [x] documentation to track all feature gates added to [etcd-io/website]: https://github.com/etcd-io/website/issues/940.
- Milestone 3
- [x] migrate all existing
--experimentalserver level features to feature gates without removing the old flags. List of features to migrate:- [x]
experimental-stop-grpc-service-on-defrag: https://github.com/etcd-io/etcd/pull/18359 - [x]
experimental-enable-distributed-tracing: https://github.com/etcd-io/etcd/issues/19050 - [x]
experimental-initial-corrupt-check: https://github.com/etcd-io/etcd/pull/18478 - [x]
experimental-compact-hash-check-enabled: https://github.com/etcd-io/etcd/pull/19053 - [x]
experimental-enable-lease-checkpointandexperimental-enable-lease-checkpoint-persist: https://github.com/etcd-io/etcd/pull/19356 - [x]
experimental-memory-mlock: https://github.com/etcd-io/etcd/issues/19061 - [x]
experimental-txn-mode-write-with-shared-buffer: https://github.com/etcd-io/etcd/issues/19063
- [x]
Why is this needed?
To make it easier to add and use new features in etcd. Please refer to https://github.com/kubernetes/enhancements/issues/4578 for more details.
while https://github.com/kubernetes/enhancements/pull/4610 is still in review, I think there is no much contention regarding server level feature gate. we can get started on the implementation. /cc @stackbaek @henrybear327
Thanks @siyuanfoundation for the heads up! Happy to contribute to this feature :) Looking forward to it!
Can we wait for approvals for the KEP so the implementation done by contributors don't need to be updated when the design details are changed frustrating the contributors?
/cc @YoyinZyc
Several remaining issues/tasks:
- The flag
--experimental-set-member-localaddris missed. It's added in 3.6 only, so we can just change it to feature gate. assigned to @henrybear327- [x] Action: A PR is needed. Contribution is welcome. Thanks. https://github.com/etcd-io/etcd/pull/19413
- [ ] Action: Please also double check if we miss any other experimental flags
- All the deprecated fields are not correctly marked. The comment should begins with
Deprecated:, refer to https://go.dev/wiki/Deprecated. The benefit is that IDEs will automatically draw a line through the field, so with better awareness to the users. assigned to @ivanvc- [x] Action: A PR is needed. Contribution is welcome. Thanks. Done in https://github.com/etcd-io/etcd/pull/19401
- The deprecated comment/description isn't consistent across "help/flag description/field comment" etc. For example, for
--experimental-txn-mode-write-with-shared-buffer, we only added "deprecated ..." in the help.go, but not in the field comment, nor the flag description. Fixed by @ivanvc in https://github.com/etcd-io/etcd/pull/19408- [x] Action: A PR is needed. Contribution is welcome. Thanks
- For embedded use cases, if an experimental flag was migrated to a non-experimental flag, users aren't able to set the experimental field any more, It won't work. For example https://github.com/etcd-io/etcd/blob/6485d48f25f59458ae4a8fde3b00c50774a714c7/tests/integration/tracing_test.go#L59 Instead users have to set the related non-experimental field. It's a breaking change, we need to clearly document this.
- [ ] Action: documentation
- For embedded use cases, if an experimental flag was migrated to the feature gate, users aren't able to set the experimental field any more, It won't work. It's also a breaking change. We need to document this, also we need to add document to users how to set it.
- [ ] Action: documentation
cc @fuweid @jmhbnz @siyuanfoundation @henrybear327 @ivanvc @serathius
I can take on some tasks while the e2e PRs are being reviewed / merged! Please assign it directly to me and I will submit the PR before midnight!
Thanks @ahrtr
@henrybear327 thx, can you deliver a PR for The flag --experimental-set-member-localaddr is missed firstly?
@henrybear327 thx, can you deliver a PR for
The flag --experimental-set-member-localaddr is missedfirstly?
Duly noted.
I'll take the second one:
All the deprecated fields are not correctly marked.
The deprecated comment/description isn't consistent across "help/flag description/field comment" etc. For example, for --experimental-txn-mode-write-with-shared-buffer, we only added "deprecated ..." in the help.go, but not in the field comment, nor the flag description.
I did another pass at deprecations and fixed the following:
-
ExperimentalTxnModeWriteWithSharedBuffer -
ExperimentalStopGRPCServiceOnDefrag -
ExperimentalInitialCorruptCheck - Standardized all deprecation messages in
server/etcdmain/help.go
See PR: https://github.com/etcd-io/etcd/pull/19408
@henrybear327 thx, can you deliver a PR for
The flag --experimental-set-member-localaddr is missedfirstly?Duly noted.
@henrybear327 are you able to deliver a PR to fix this today (asap)? thx.
@henrybear327 thx, can you deliver a PR for
The flag --experimental-set-member-localaddr is missedfirstly?Duly noted.
@henrybear327 are you able to deliver a PR to fix this today (asap)? thx.
Yes, I am on it. Work is currently ongoing as I ran into some issues that I am debugging now!
@henrybear327 the migration of the missing experimental flag --experimental-set-member-localaddr has higher priority than the downgrade e2e test (https://github.com/etcd-io/etcd/pull/19399) to me.
@henrybear327 the migration of the missing experimental flag
--experimental-set-member-localaddrhas higher priority than the downgrade e2e test (#19399) to me.
Sorry for the delay, initial attempt is put up.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
I believe we can now close this issue, as all tasks have been completed. Unless I'm missing something?
I think so. @siyuanfoundation could you please confirm if we could close this one now?
Closing. I believe there are no remaining action items. Please feel free to re-open otherwise.