apis/v1alpha2: collapse ServiceImport spec and status fields to root
Refs https://github.com/kubernetes-sigs/mcs-api/issues/48#issuecomment-2043727874
The source of truth for these values is the exported Service, and these fields should generally be written to by a controller, not a human. Manually updating these fields could cause unexpected behavior.
This would be a breaking change, necessitating a v1alpha2 API version.
We expect this change to have minimal impact on existing users of either v1alpha1 or forked CRDs, though, because they largely interact with ServiceImport through the `foo.clusterset.local` address, not the CRD directly.
- ~~GKE has forked the CRD to make this change, and fixing this upstream I hope could allow them to de-fork.~~
- Azure Fleet has ~~similarly~~ forked the CRD to make this same change, and engineers building an MCS implementation have asked about this in Kubernetes Slack and indicated support for changing this upstream and adopting the MCS API CRDs directly if this change were made.
I'm hoping that fixing this directly in the upstream spec would be an alternative to #20, and that all implementations would be able to use the MCS API CRDs directly.
UPDATE: Notably, removing the possibility of adding user-editable fields under spec likely means it may not be possible to automate having a multi-cluster controller generate a ServiceImport with an appropriate name, placed in an appropriate namespace, for service networking beyond ClusterSet boundaries, where "sameness" cannot be assumed.
~~I'm proposing to retain the now-empty .spec stanza because I believe there is future additive scope (intentionally excluded from this proposal) for the ability to export a service beyond a ClusterSet to be imported/consumed in a different Cluster which may not have the same "sameness" guarantees.~~
A usage pattern where a user would manually create and name a ServiceImport (instead of an automated controller managing the resource) would have been a way to handle cross-ClusterSet service networking: spec fields would map the import to a ClusterSet-external exported service "known" to the cluster, and a `status.conditions` field on ServiceImport would let a controller report whether this attempted mapping was successful. However, this pattern has not seen sufficient demand to justify enabling it, and it raises concerns around overly-broad permissions and a complex handshake UX between clusters in unassociated ClusterSets.
With this change to the API it will likely become impractical to implement this pattern, but we largely view this as an acceptable tradeoff; the typical way to access remote services beyond the ClusterSet boundary should continue to be through a gateway pattern.
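Concretely, the collapse could look roughly like the following Go sketch (illustrative only: it assumes the v1alpha1 spec/status fields are simply hoisted to the object root and that the helper types `ServiceImportType`, `ServicePort` and `ClusterStatus` are carried over from v1alpha1 unchanged; the actual v1alpha2 definition is what this PR's diff contains):

```go
package v1alpha2

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ServiceImport describes a service imported from clusters in a ClusterSet.
// Sketch only: fields previously under .spec and .status now live at the root
// and are expected to be written by a controller, not a human.
type ServiceImport struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Previously .spec.*
	Type                  ServiceImportType             `json:"type"`
	Ports                 []ServicePort                 `json:"ports,omitempty"`
	IPs                   []string                      `json:"ips,omitempty"`
	SessionAffinity       corev1.ServiceAffinity        `json:"sessionAffinity,omitempty"`
	SessionAffinityConfig *corev1.SessionAffinityConfig `json:"sessionAffinityConfig,omitempty"`

	// Previously .status.*
	Clusters []ClusterStatus `json:"clusters,omitempty"`
}
```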
Other MCS API implementations from which to solicit feedback on this change:
- [x] Cilium - implementation coming soon, ServiceImport managed by controller, supportive of this change
- [x] Submariner Lighthouse (uses v1alpha1 CRDs directly)
- [ ] Karmada
  - Appears to require having a user manually create the ServiceImport resource, setting the `.spec.type` and `.spec.ports` fields. Would be helpful to get direct feedback on whether this is desirable for any specific functionality (such as only exporting selected ports?), or whether the UX would be simplified if a controller set these in status automatically based on the underlying exported Services.
- [ ] AWS Cloud Map MCS Controller
  - In a blog post the v1alpha1 CRDs appear to be used directly, and it includes log snippets showing "ServiceImport IPs need update" and "updated ServiceImport", indicating that the Cloud Map controller is modifying these fields.
- [ ] Antrea
  - Appears to use the v1alpha1 MCS API CRDs directly; I haven't investigated further implementation details.
Please add other known implementations!
Cilium implementation is still WIP (but a mere few weeks, maybe a month, from completion). We are planning to only have ServiceImport auto-created/deleted by our controllers right now, as suggested by the KEP, so from a Cilium perspective this looks good to me!
As per @skitt's comment in office hours, Liqo is also an MCS API stakeholder.
cc @mikemorris
Looking into this a bit more, I'm not sure Liqo actually implements the MCS API.
Hi, for what it's worth, per the convo in SIG-MC the docs for GKE are wrong and `ips` is in the spec stanza. That being said, at SIG-MC on 7/23 we were leaning (myself included) towards moving it to be a root field.
Cilium implementation is still WIP (but a mere few weeks, maybe a month, from completion). We are planning to only have ServiceImport auto-created/deleted by our controllers right now, as suggested by the KEP, so from a Cilium perspective this looks good to me!
I wonder if you can share any link to that implementation? Is it completed?
Sure! It's mainly here https://github.com/cilium/cilium/tree/main/pkg/clustermesh/mcsapi and there are some other interesting bits here https://github.com/cilium/cilium/tree/main/pkg/clustermesh/endpointslicesync. It's not fully done yet unfortunately, but very soon! The last major thing (the service import controller) is in a PR being reviewed/improved; after that we still need to do a few things here and there (like integrating the conformance/e2e tests in our CI), but our implementation will be in a working state. The next Cilium release is at the beginning of next year, so we still have some time to iron out the last details anyway!
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mikemorris Once this PR has been reviewed and has the lgtm label, please assign skitt for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Proposal, diff and PR description have been updated per this discussion, please take another look as this should now be ready for review!
Running `make` I see the following errors, and I'm not quite sure where/what to specify for these in alignment with https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md#incompatible-api-changes and stepped transitions like https://static.sched.com/hosted_files/kcsna2022/75/KubeCon%20Detroit_%20Building%20a%20k8s%20API%20with%20CRDs.pdf#page=17 (I'm not sure to what extent this is applicable for a breaking alpha change):
```
sigs.k8s.io/mcs-api/pkg/apis/v1alpha1:-: CRD for ServiceExport.multicluster.x-k8s.io has no storage version
sigs.k8s.io/mcs-api/pkg/apis/v1alpha2:-: CRD for ServiceImport.multicluster.x-k8s.io has no storage version
```
@mikemorris does this help?
- https://book.kubebuilder.io/multiversion-tutorial/api-changes.html#storage-versions
With the introduction of a 2nd version, we'll need to mark one (v1alpha2, if I understand how this works correctly) w/ the `// +kubebuilder:storageversion` pragma.
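For example, a minimal sketch (assuming v1alpha2 is the version we want persisted to etcd; the marker simply goes above the type in the v1alpha2 package):

```go
// pkg/apis/v1alpha2/types.go (sketch)
package v1alpha2

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// +kubebuilder:object:root=true
// +kubebuilder:storageversion
// ServiceImport is marked as the storage version; v1alpha1 objects are
// converted to this version before being written to etcd.
type ServiceImport struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	// ...fields as defined in this PR...
}
```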
I can confirm this works.
This also needs a few changes to the build infrastructure to support generating code for both packages:
```diff
--- a/hack/update-codegen.sh
+++ b/hack/update-codegen.sh
@@ -28,7 +28,7 @@ gobin="${GOBIN:-$(go env GOPATH)/bin}"
 OUTPUT_PKG=sigs.k8s.io/mcs-api/pkg/client
 OUTPUT_DIR=$SCRIPT_ROOT/pkg/client
-FQ_APIS=sigs.k8s.io/mcs-api/pkg/apis/v1alpha1
+FQ_APIS=(sigs.k8s.io/mcs-api/pkg/apis/v1alpha1 sigs.k8s.io/mcs-api/pkg/apis/v1alpha2)
 CLIENTSET_NAME=versioned
 CLIENTSET_PKG_NAME=clientset
@@ -44,22 +44,22 @@ fi
 COMMON_FLAGS="--go-header-file ${SCRIPT_ROOT}/hack/boilerplate.go.txt"
 echo "Generating clientset at ${OUTPUT_PKG}/${CLIENTSET_PKG_NAME}"
-"${gobin}/client-gen" --clientset-name "${CLIENTSET_NAME}" --input-base "" --input "${FQ_APIS}" --output-pkg "${OUTPUT_PKG}/${CLIENTSET_PKG_NAME}" --output-dir "$OUTPUT_DIR/$CLIENTSET_PKG_NAME" ${COMMON_FLAGS}
+"${gobin}/client-gen" --clientset-name "${CLIENTSET_NAME}" --input-base "" "${FQ_APIS[@]/#/--input=}" --output-pkg "${OUTPUT_PKG}/${CLIENTSET_PKG_NAME}" --output-dir "$OUTPUT_DIR/$CLIENTSET_PKG_NAME" ${COMMON_FLAGS}
 echo "Generating listers at ${OUTPUT_PKG}/listers"
-"${gobin}/lister-gen" "${FQ_APIS}" --output-pkg "${OUTPUT_PKG}/listers" --output-dir "${OUTPUT_DIR}/listers" ${COMMON_FLAGS}
+"${gobin}/lister-gen" "${FQ_APIS[@]}" --output-pkg "${OUTPUT_PKG}/listers" --output-dir "${OUTPUT_DIR}/listers" ${COMMON_FLAGS}
 echo "Generating informers at ${OUTPUT_PKG}/informers"
 "${gobin}/informer-gen" \
-  "${FQ_APIS}" \
+  "${FQ_APIS[@]}" \
   --versioned-clientset-package "${OUTPUT_PKG}/${CLIENTSET_PKG_NAME}/${CLIENTSET_NAME}" \
   --listers-package "${OUTPUT_PKG}/listers" \
   --output-pkg "${OUTPUT_PKG}/informers" \
   --output-dir "${OUTPUT_DIR}/informers" \
   ${COMMON_FLAGS}
-echo "Generating register at ${FQ_APIS}"
-"${gobin}/register-gen" ${FQ_APIS} --output-file zz_generated.register.go ${COMMON_FLAGS}
+echo "Generating register at" "${FQ_APIS[@]}"
+"${gobin}/register-gen" "${FQ_APIS[@]}" --output-file zz_generated.register.go ${COMMON_FLAGS}
 if [[ "${VERIFY_CODEGEN:-}" == "true" ]]; then
```
Regarding my research into how we could handle the transition between v1alpha1 and v1alpha2:
AFAICT the current approach would be breaking for consumers of the ServiceImport resource, as per:
if the default strategy value None is specified, the only modifications to the object are changing the apiVersion string and perhaps [pruning unknown fields](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#field-pruning) (depending on the configuration). Note that this is unlikely to lead to good results if the schemas differ between the storage and requested version. In particular, you should not use this strategy if the same data is represented in different fields between versions.
(https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#writing-reading-and-updating-versioned-customresourcedefinition-objects)
It means that a controller querying v1alpha1 while the object is stored as v1alpha2, or the opposite, will probably see bogus/empty fields. It's true that ServiceImport is entirely managed by the MCS-API implementation, so this may only be a very temporary situation before everything is migrated to v1alpha2. An example of such a controller reading ServiceImport could be one answering DNS requests (I will assume that's the main use case for the rest of the post), which means that a controller reading v1alpha1 ServiceImports will see bogus objects once they are migrated to v1alpha2, and a controller reading v1alpha2 will see bogus objects for ServiceImports not yet migrated from v1alpha1.
So now for the options:
- **Let the implementation handle this.** This assumes that we don't do much for this transition and let implementations do the hard work. This may include:
  - Implementing a conversion webhook
  - Some strategy like freezing the ServiceImport cache of the controllers still watching v1alpha1, converting the ServiceImports to v1alpha2 via their controller, and continuing the rollout once done
  - Acknowledging there will be disturbance/downtime during this transition and notifying users about it
  - Some other things?
- **Implementing a conversion webhook in this repo.** We could implement webhook conversion (https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#webhook-conversion), which would make the transition between v1alpha1 and v1alpha2 seamless: a controller requesting v1alpha1 while a resource is stored as v1alpha2 would have it converted by the webhook, and vice versa. We could either provide the full webhook service in this repo or only expose the appropriate functions to be integrated into a real service by implementations (see the sketch after this list).
- **Double publishing fields on the CRD.** We could also make v1alpha2 a union of the current v1alpha2 and v1alpha1 fields and suggest implementations double-publish the info to both the spec fields and the root fields. Controllers querying v1alpha1 would then still get valid objects, and we could start controllers querying v1alpha2 against the old deprecated v1alpha1 fields, then switch them to the new fields once the initial migration phase is complete (once every ServiceImport has been reprocessed by its controller). We would then remove those deprecated fields in the next version (v1beta1?). This should make the transition seamless but is unconventional, to say the least...
- **Do not do this transition.** We could also decide that this transition is not worth it versus the migration cost, and not do it.
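To make the webhook option a bit more concrete, here is a rough sketch using controller-runtime's conversion interfaces, assuming v1alpha2 is the storage ("hub") version; the root-level field names and omitted field copies are illustrative placeholders, not the final API:

```go
package v1alpha1

import (
	"sigs.k8s.io/controller-runtime/pkg/conversion"

	v1alpha2 "sigs.k8s.io/mcs-api/pkg/apis/v1alpha2"
)

// ConvertTo converts this v1alpha1 ServiceImport to the v1alpha2 hub version,
// hoisting the spec/status fields to the object root.
func (src *ServiceImport) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*v1alpha2.ServiceImport)
	dst.ObjectMeta = src.ObjectMeta
	dst.IPs = src.Spec.IPs // hypothetical root-level field
	// ...copy the remaining spec/status fields the same way...
	return nil
}

// ConvertFrom converts from the v1alpha2 hub version back into v1alpha1,
// restoring the spec/status layout for old clients.
func (dst *ServiceImport) ConvertFrom(srcRaw conversion.Hub) error {
	src := srcRaw.(*v1alpha2.ServiceImport)
	dst.ObjectMeta = src.ObjectMeta
	dst.Spec.IPs = src.IPs
	// ...copy the remaining fields back...
	return nil
}

// The v1alpha2 package only needs the hub marker method:
// func (*ServiceImport) Hub() {}
```

Whether this lives as a full webhook service in this repo or just as exported conversion functions that implementations wire into their own webhook is the open question above.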
I believe that Submariner would be one of the main projects affected by this, as you have some production users "endorsed by the project", while many other projects supporting the MCS API either carry warnings regarding production usage or have not shipped a stable release yet, IIUC. So I would say you should have the most say in which way we go, @tpantelis @skitt.
FYI I am hoping to get to the bottom of this before the next Cilium version is released (the first version to include MCS API support!), considering we have a feature freeze somewhere in December and the next release should be in January/February (guessing that if it's just to skip a CRD transition and we run out of time here, it could still go in during the feature freeze period before the stable release). So hopefully we can figure this out soon :pray:.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle stale`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle rotten`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
If I'm stepping on your toes @mikemorris apologies! Feel free to close if this particular work has atrophied and/or been replaced by something else.
Closing so we can focus on completing this work in #85