datadog-operator
Add notify_by, groupby_simple_monitor, & renotify_occurrences config options to monitors crd
What does this PR do?
Provides the notify_by, groupby_simple_monitor & renotify_occurrences monitor options
Motivation
Closes: https://github.com/DataDog/datadog-operator/issues/833 https://github.com/DataDog/datadog-operator/issues/814
Additional Notes
groupby_simple_monitor only updates Log Monitors. renotifyOccurrences requires renotifyInterval to be configured. The unit test in monitor_test.go only checks that the monitor can be built and passed to the API Go client. Validation is ultimately handled through the API Go client, which returns a 4xx error if the configuration is invalid.
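The dependency between renotifyOccurrences and renotifyInterval noted above can be sketched as a small local check. This is only an illustration: the `monitorOptions` type and the `checkRenotify` and `int64Ptr` helpers are hypothetical names, not part of the operator, and this PR itself leaves validation to the API.

```go
package main

import (
	"errors"
	"fmt"
)

// monitorOptions is a hypothetical stand-in for the options block of a
// DatadogMonitor spec; it is not the operator's real type.
type monitorOptions struct {
	RenotifyInterval    *int64
	RenotifyOccurrences *int64
}

// checkRenotify returns an error when renotifyOccurrences is set without
// renotifyInterval, mirroring the constraint described in the notes.
func checkRenotify(o monitorOptions) error {
	if o.RenotifyOccurrences != nil && o.RenotifyInterval == nil {
		return errors.New("renotifyOccurrences requires renotifyInterval")
	}
	return nil
}

// int64Ptr is a small helper for building optional fields.
func int64Ptr(v int64) *int64 { return &v }

func main() {
	valid := monitorOptions{RenotifyInterval: int64Ptr(1), RenotifyOccurrences: int64Ptr(3)}
	invalid := monitorOptions{RenotifyOccurrences: int64Ptr(3)}

	fmt.Println("valid options:", checkRenotify(valid))
	fmt.Println("invalid options:", checkRenotify(invalid))
}
```

A check like this would only give earlier feedback; the API response remains the source of truth for what is accepted.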
Minimum Agent Versions
N/A
Describe your test plan
- Checkout branch levipe01/monitors_crd_additional_options
- Update config/manager/manager.yaml to disable the webhook, enable Datadog monitors, and apply Datadog API/App keys:
  spec:
    containers:
    - command:
      - /manager
      args:
      - --enable-leader-election
      - --pprof
      - --webhookEnabled=false
      - --datadogMonitorEnabled=true
      ...
      env:
      - name: DD_API_KEY
        valueFrom:
          secretKeyRef:
            name: "datadog-secret"
            key: api-key
      - name: DD_APP_KEY
        valueFrom:
          secretKeyRef:
            name: "datadog-secret"
            key: app-key
- Following the steps from how-to-contribute.md and using a kind cluster, run:
  - make build
  - make IMG=test/operator:test IMG_CHECK=test/operator-check:test docker-build
  - kind load docker-image test/operator:test
  - kind load docker-image test/operator-check:test
  - make IMG=test/operator:test IMG_CHECK=test/operator-check:test deploy
- Apply a DatadogAgent with the below config to generate logs:
  apiVersion: datadoghq.com/v2alpha1
  kind: DatadogAgent
  metadata:
    name: datadog
  spec:
    global:
      kubelet:
        tlsVerify: false
      clusterName: kind
      credentials:
        apiSecret:
          keyName: api-key
          secretName: datadog-secret
    features:
      logCollection:
        enabled: true
        containerCollectAll: true
- For each config option:
  - Apply a new log monitor using the below config (substituting in each new config option separately):
    apiVersion: datadoghq.com/v1alpha1
    kind: DatadogMonitor
    metadata:
      name: datadog-log-alert-test
      namespace: system
    spec:
      query: "logs(\"source:agent\").index(\"main\").rollup(\"count\").by(\"status,host\").last(\"1h\") > 5"
      type: "log alert"
      name: "Test log alert made from DatadogMonitor"
      message: "1-2-3 testing"
      tags:
        - "test:datadog"
      priority: 5
      options:
        renotifyInterval: 1
        renotifyOccurrences: 3
        groupbySimpleMonitor: false
        notifyBy: ["abc"]
- For each config option, passing in an invalid setting should result in a 4xx error visible in the logs of the operator. For example, the above notifyBy config should yield the below log:
{"level":"ERROR","ts":"2024-03-15T17:00:01Z","logger":"controllers.DatadogMonitor","msg":"error updating monitor","datadogmonitor":"system/datadog-log-alert-test","Monitor ID":141258523,"error":"error validating monitor: 400 Bad Request: {\"errors\":[\"Invalid notify_by value found: 'abc'; values must match the query's group keys: 'host,status'\"]}"}
- Adding a valid config should update the below fields in the UI (visible when editing the generated monitor):
- Changing the configuration with another valid setting should update the monitor successfully.
- Deleting the corresponding DatadogMonitor object should delete the monitor from the UI.
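The notifyBy rejection in the test plan above comes from the API, which reports that notify_by values must match the query's group keys. A minimal Go sketch of that subset rule is shown here to make clear why "abc" fails against a query grouped by status and host. The `invalidNotifyBy` function is a hypothetical helper, not operator or API client code, and it does not model any special values the API may additionally accept.

```go
package main

import (
	"fmt"
	"strings"
)

// invalidNotifyBy returns the notifyBy entries that are not among the
// query's group keys. An empty result means every entry matched.
func invalidNotifyBy(notifyBy, groupKeys []string) []string {
	allowed := make(map[string]struct{}, len(groupKeys))
	for _, k := range groupKeys {
		allowed[strings.TrimSpace(k)] = struct{}{}
	}

	var bad []string
	for _, v := range notifyBy {
		if _, ok := allowed[v]; !ok {
			bad = append(bad, v)
		}
	}
	return bad
}

func main() {
	// Group keys taken from the .by("status,host") clause of the test query.
	groupKeys := strings.Split("status,host", ",")

	fmt.Println("unmatched for [abc]:", invalidNotifyBy([]string{"abc"}, groupKeys))
	fmt.Println("unmatched for [host]:", invalidNotifyBy([]string{"host"}, groupKeys))
}
```

Replacing "abc" with "host" or "status" in the test monitor is one way to obtain a valid notifyBy setting for the later steps.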
Checklist
- [x] PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
- [x] PR has a milestone or the qa/skip-qa label
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 58.97%. Comparing base (e0fa00d) to head (ae74c3e).
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1123      +/-   ##
==========================================
+ Coverage   58.95%   58.97%   +0.01%
==========================================
  Files         174      174
  Lines       21371    21380       +9
==========================================
+ Hits        12600    12609       +9
  Misses       8004     8004
  Partials      767      767
Flag | Coverage Δ | |
---|---|---|
unittests | 58.97% <100.00%> (+0.01%) | :arrow_up: |
Flags with carried forward coverage won't be shown.
Files | Coverage Δ | |
---|---|---|
apis/datadoghq/v1alpha1/datadogmonitor_types.go | 100.00% <ø> (ø) | |
controllers/datadogmonitor/monitor.go | 68.39% <100.00%> (+1.54%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data