crossplane-runtime
crossplane-runtime copied to clipboard
Late initialization overrides desired value when it is zero-value
What happened?
We have implemented late-initialization pattern across the resources. However, it relies upon the field's value being nil or zero-value. Example late-init functions from stack-gcp:
// LateInitializeInt64 implements late initialization for int64 type.
func LateInitializeInt64(i *int64, from int64) *int64 {
if i != nil || from == 0 {
return i
}
return &from
}
// LateInitializeBool implements late initialization for bool type.
func LateInitializeBool(b *bool, from bool) *bool {
if b != nil || !from {
return b
}
return &from
}
This implementation is buggy because it overrides when the user actually wants the zero-value for that field. For example, the value of field foo *bool
is true
in both spec and cloud provider and the user wants to make it false
. In that case LateInitializeBool
will override that value with true
because it will assume that zero-value is a case where user doesn't have an opinion about that field and the value from cloud provider will be used. This seems easy to fix for pointer types by converting the function into:
// LateInitializeBool implements late initialization for bool type.
func LateInitializeBool(b *bool, from bool) *bool {
if b != nil {
return b
}
return &from
}
But there is a catch. We follow Kubernetes pattern for optional fields in our API design, which means that optional fields have the following jsontag:
// +optional
foo *string `json:"foo,omitempty"`
The tag omitempty
actually causes the value to be nil
in the Go code if assigned value is zero-value, in this case false
. So, it comes as nil
even if user wanted it to be false
. This works for Kubernetes late-inited fields like pod.spec.nodeName
because those fields will be eventually filled but that's not the case for us. The Go zero-value for a field could actually be the final value that the user wants for that field.
I'd like us to discuss not using omitempty
tag at all, so that zero-values come to the controller as is instead of being converted to nil
. In terms of api-server validation, nothing changes as // +optional
tag is used to mark the fields as required or not in the CRD. Though not using omitempty
is divergence from the optional/required design pattern that Kubernetes suggests.
/cc @negz @hasheddan
This is tough because I agree that the solution here is just not using omitempty
at all, but that would be a pretty big divergence. However, I feel that the way it is used even in Kubernetes native types is a bit of an anti-pattern. If I understand correctly, our code would not have to change except for literally removing the omitempty
tags because we still want to handle nil
optional values in the same manner, but right now we are being given nil
values in some cases where we do not intend them (i.e. providing a zero-values explicitly to a pointer type). Is this your understanding as well?
It doesn't sound like omitting omitempty goes against conventions:
In most cases, optional fields should also have the omitempty struct tag (the omitempty option specifies that the field should be omitted from the json encoding if the field has an empty value). However, If you want to have different logic for an optional field which is not provided vs. provided with empty values, do not use omitempty (e.g. https://github.com/kubernetes/kubernetes/issues/34641).
@muvaf @negz do we feel good about removing omitempty
across the board? If so, I would like to go ahead and move on this while implementing more v1beta1
resources.
@hasheddan Are you confident it won't have any negative side-effects/behaviour changes? If so I'm okay with it.
I think I'll need to do some experiments to make sure we're able to capture all three: nil value, zero value, actual value. Or @hasheddan could also play with it and report here.
@muvaf yep I'll pull some of our real-world examples and show how the change would impact :+1:
https://github.com/kubernetes-sigs/cluster-api/issues/707
Some more prior art here. Having swapped back in the context of this issue I think we're on the right path. I'm not sure if we want to completely remove omitempty
everywhere, but we should remove it for fields for which we want to distinguish between:
- Spec author has no opinion, propagate the default value back from the cloud.
- Spec author wants the field to be the zero value.
- Spec author wants the field to be a non-zero value.
So we'd represent these cases (using an *int field as example) as:
- Spec author omits the field, which gets serialized as a
nil
pointer. - Spec author writes 0, which gets serialised as a pointer to int 0.
- Spec author writes 8, which gets serialised as a pointer to int 8.
https://crossplane.slack.com/archives/CKXQHM7U3/p1581113468101800 https://play.golang.org/p/NnWsj9BaTeJ https://play.golang.org/p/oxQgQxtWK62
Cross-linking some discussion from Slack. I actually now think our late init code is ~working as expected~ a little broken when it comes to late-initializing an unspecified spec field from a cloud provider's zero value. 🤔 I believe:
func LateInitializeBool(b *bool, from bool) *bool {
if b != nil || !from {
return b
}
return &from
}
In the above case b
is the user provided value and from
is the value we get back from the cloud provider. If we walk through some cases, assuming the following struct:
type Example struct {
Field *bool `json:"field,omitempty"`
}
- The user writes
field: false
. This is unmarshalled asField: *false
. Thereforeb != nil
, so we returnb
, thus keeping the user provided value without ever considering whatfrom
(the cloud provider value) was set to. - The user writes
field: true
. Same as above - it's not nil so we keep it. - The user omits
field
from their spec. This is unmarshalled asField: nil
, sob == nil
. We deem the user to have no opinion about the value of this field and proceed to late initialisation from the cloud provider value.- If the provider value was
true
, we late initializeField: *true
. - If the provider value was
false
, we returnb
, which isnil
.
- If the provider value was
So actually in the case of *bool
fields it seems like the alternative logic @muvaf proposed would actually be better:
func LateInitializeBool(b *bool, from bool) *bool {
if b != nil {
return b
}
return &from
}
With this logic if the cloud provider specified false
and the user had no opinion (omitted the field) we'd late initialize the field to *false
, rather than nil
.
I think the two external issues relating to omitempty
that were linked in this issue were red herrings, because they deal with fields that are slices (not pointer to slices).
Another option would be to add a flag to the CRDs that disallow late init stuff? We may need to validate that all late init fields has a value if this new flag is set.
This will allow us to have complete control using the Crossplane CRDs.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale
because it has had no activity in the last 90 days. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh
will mark this issue as not stale.