datadog-operator
conditions required value
(Note: I'm not 100% sure this is the right place, since this is CRD related. It may more appropriately belong against here; however, I presume that the CRD is generated from somewhere, like here!)
Describe what happened: When applying a fresh DatadogAgent v2alpha1 resource with operator 1.0.0-rc.8, without a previous DatadogAgent CRD deployed, the following two CRD validation errors are thrown:
* status.conditions.reason: Required value
* status.conditions.message: Required value
Describe what you expected:
Previously conditions had omitEmpty set on it, compared to now. I think this is correct, as the status is set by the controller, but without being able to deploy it, it can never be set.
This may be a red herring though, as the v1 CRD has required for status and type, and v2 has required for status, type, reason, message, and lastTransitionTime.
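To make the difference between the two required lists concrete, here is a minimal Python sketch of a required-key check. It is not the API server's validation code; the required lists are the ones quoted above, and the sample condition (including the type name "Active") is hypothetical.

```python
# Required keys per API version, as listed in the comment above.
REQUIRED = {
    "v1alpha1": ["status", "type"],
    "v2alpha1": ["status", "type", "reason", "message", "lastTransitionTime"],
}


def missing_required(condition, version):
    """Return the required keys that are absent from one condition entry."""
    return [key for key in REQUIRED[version] if key not in condition]


# Hypothetical condition written without the optional reason/message fields.
condition = {
    "type": "Active",
    "status": "True",
    "lastTransitionTime": "2023-02-13T22:46:11Z",
}

for version in REQUIRED:
    for key in missing_required(condition, version):
        print(f"{version}: status.conditions.{key}: Required value")
```

Under these assumptions the same condition passes the v1 list and fails the v2 list on exactly the two fields reported in the error.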
Workaround: Patch the CRD to remove the required values
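As a rough illustration of that workaround, the sketch below removes reason and message from the required list of status.conditions in a CRD loaded as a Python dict. The trimmed-down CRD is a stand-in, not the real generated manifest, and the helper name relax_condition_fields is made up for this example; the path follows the standard apiextensions.k8s.io/v1 layout (spec.versions[].schema.openAPIV3Schema).

```python
import copy


def relax_condition_fields(crd, version_name, fields=("reason", "message")):
    """Return a copy of the CRD with `fields` dropped from the `required`
    list of status.conditions items in the named version."""
    patched = copy.deepcopy(crd)
    for version in patched["spec"]["versions"]:
        if version["name"] != version_name:
            continue
        status = version["schema"]["openAPIV3Schema"]["properties"]["status"]
        items = status["properties"]["conditions"]["items"]
        items["required"] = [f for f in items.get("required", []) if f not in fields]
    return patched


# Stand-in CRD containing only the parts this sketch touches (assumption).
crd = {
    "spec": {
        "versions": [
            {
                "name": "v2alpha1",
                "schema": {"openAPIV3Schema": {"properties": {"status": {"properties": {
                    "conditions": {"items": {"required": [
                        "lastTransitionTime", "message", "reason", "status", "type",
                    ]}}
                }}}}},
            }
        ]
    }
}

patched = relax_condition_fields(crd, "v2alpha1")
schema = patched["spec"]["versions"][0]["schema"]["openAPIV3Schema"]
print(schema["properties"]["status"]["properties"]["conditions"]["items"]["required"])
```

The patched dict would still need to be written back to the cluster, and a later chart or operator upgrade may overwrite it.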
Hitting this problem too. In our case the DatadogAgent was created manually first, and then we wanted to annotate it as managed by helm. The annotation cannot be applied because the agent status is not valid. You can't edit the agent at all (even to, say, remove the entire status) because the error keeps reappearing; the only solution is to edit the CRD as described above to remove the extra required fields.
I think this is a dupe of #654 tho
Ok, I read a lot of the code and now understand this one better. It is a bug, but the correct solution is not the one described above.
What happened was we had deployed a v1alpha1 DatadogAgent resource. This uses a DatadogCondition type with optional reason/message: https://github.com/DataDog/datadog-operator/blob/deac79437c21c685ca368b729703468836cc5517/apis/datadoghq/v1alpha1/datadogagent_types.go#L1370-L1387. However, once you deploy the v2alpha1 CRD, kubectl will fetch using the latest API version by default (see https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#version-priority):
apiVersion: v1
items:
- apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
annotations:
meta.helm.sh/release-name: datadog-agent
meta.helm.sh/release-namespace: default
creationTimestamp: "2023-02-13T22:46:11Z"
finalizers:
BUT you can tell kubectl explicitly the api version you want to use:
$ kubectl get datadogagents.v1alpha1.datadoghq.com -o yaml |head
apiVersion: v1
items:
- apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
annotations:
meta.helm.sh/release-name: datadog-agent
meta.helm.sh/release-namespace: default
creationTimestamp: "2023-02-13T22:46:11Z"
finalizers:
If you use the v1alpha1 api, you can happily edit, label, annotate this resource. If you don't specify the version, it's broken.
From an end user perspective this means you always need to use datadogagents.v1alpha1.datadoghq.com to refer to these resources safely.
So where's the bug? Well... this CRD has storage set to 'true' for v1alpha1, but served is 'true' for both v1alpha1 and v2alpha1, so by default kubectl will always pick the version that isn't stored. This seems like an odd choice to me. This could work if the operator always filled the (optional) reason and message fields in the older definition, or if the storage field was set to true on the newer type (and possibly add a conversion webhook).
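The mismatch can be sketched in Python. This is a simplified reading of the version-priority rules from the Kubernetes page linked earlier (GA before beta before alpha, then larger numbers first, non-matching names last), not the API server's implementation, and the function names are invented for this example.

```python
import re

_PATTERN = re.compile(r"^v(\d+)(?:(alpha|beta)(\d+))?$")
_STABILITY = {None: 0, "beta": 1, "alpha": 2}  # GA sorts first


def priority_key(name):
    """Sort key: a smaller key means a higher-priority version name."""
    match = _PATTERN.match(name)
    if match is None:
        # Names that do not follow the Kubernetes pattern sort last, alphabetically.
        return (1, 0, 0, 0, name)
    major, stage, minor = match.groups()
    return (0, _STABILITY[stage], -int(major), -int(minor or 0), name)


def preferred_and_storage(versions):
    """Return (highest-priority served version, storage version)."""
    served = [v["name"] for v in versions if v["served"]]
    preferred = min(served, key=priority_key)
    storage = next(v["name"] for v in versions if v["storage"])
    return preferred, storage


# The served/storage combination described above.
versions = [
    {"name": "v1alpha1", "served": True, "storage": True},
    {"name": "v2alpha1", "served": True, "storage": False},
]

preferred, storage = preferred_and_storage(versions)
print(f"default version: {preferred}, storage version: {storage}")
```

When the two values differ and no conversion fills in the newly required fields, objects stored in the old shape can fail validation when read and written through the default version.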
aaaand digging just a little further I see the storage flag was switched to v2alpha1. So how did we get this odd mix...? It was the helm chart. https://artifacthub.io/packages/helm/datadog/datadog-operator#values
$ grep -C1 -E 'storage:|alpha1' charts/datadog-crds/templates/datadoghq.com_datadogagents_v1.yaml
type: date
name: v1alpha1
schema:
--
type: object
{{- if not (eq .Values.migration.datadogAgents.version "v2alpha1") }}
served: true
storage: true
{{- else }}
served: true
storage: false
{{- end }}
--
type: date
name: v2alpha1
schema:
--
type: object
{{- if eq .Values.migration.datadogAgents.version "v2alpha1" }}
served: true
storage: true
{{- else }}
served: true
storage: false
{{- end }}
If you set migration.datadogAgents.version to "v1alpha1" (which is the default), then v2 is served but not stored: this is the broken configuration! If you set it to "v2alpha1", it becomes the served-and-stored format, which should work. That second stanza should read:
{{- if eq .Values.migration.datadogAgents.version "v2alpha1" }}
served: true
storage: true
{{- else }}
served: false
storage: false
{{- end }}
to avoid what's happened here
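To compare the current template with the proposed change, the conditionals can be mirrored in a small Python function. This only restates the two stanzas shown above; the function name and the proposed flag are made up for illustration, and nothing here is rendered by helm.

```python
def version_flags(version_name, migration_version, proposed=False):
    """Mirror the served/storage conditionals from the chart template.

    proposed=False follows the template as shipped; proposed=True applies
    the suggested change to the v2alpha1 stanza.
    """
    selected_v2 = migration_version == "v2alpha1"
    if version_name == "v1alpha1":
        return {"served": True, "storage": not selected_v2}
    if version_name == "v2alpha1":
        if selected_v2:
            return {"served": True, "storage": True}
        return {"served": not proposed, "storage": False}
    raise ValueError(f"unknown version: {version_name}")


for proposed in (False, True):
    for migration in ("v1alpha1", "v2alpha1"):
        for name in ("v1alpha1", "v2alpha1"):
            print(proposed, migration, name, version_flags(name, migration, proposed))
```

With the default migration value, the shipped template yields a served but unstored v2alpha1, while the proposed variant leaves v2alpha1 unserved until it is also the storage version.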
Hello all - Apologies for the headache. As mentioned in the other issue (#662) we just released the Datadog Operator along with a major chart update.
We documented the migration path here. It touches on storage versions too; v2alpha is now stored by default.
Do let me know if this helps fix your issue, and if you end up migrating, feel free to share your feedback. We want to ensure a smooth experience for all of our users.
BUT you can tell kubectl explicitly the api version you want to use:
This saved my day, thanks so much @bewinsnw!
I had a stuck DatadogAgent (from v1alpha1) that I couldn't delete, remove, or edit after installing new CRDs that include v2alpha1.
This was NOT working for me, and I hit the error mentioned at the beginning:
$ k edit datadogagent datadog
# datadogagents.datadoghq.com "datadog" was not valid:
# * status.conditions.reason: Required value
# * status.conditions.message: Required value
This always ended up in:
error: Edit cancelled, no valid changes were saved.
Fortunately, after reading your message @bewinsnw I was able to do instead:
$ k edit datadogagents.v1alpha1.datadoghq.com datadog
datadogagent.datadoghq.com/datadog edited
As you can see, this worked fine. I was able to remove the finalizer and properly delete the stuck resource finally.
So, thanks a lot!
Closing the issue as it was originally opened for DatadogAgent v1alpha1 version which is now deprecated.