ovn-kubernetes
ovn-kubernetes copied to clipboard
Add IPsec tunnel state metric for IKE Child SA establishment
This PR registers metric ovnkube_controller_ipsec_tunnel_state containing local_ip and remote_ip as the labels for reporting ipsec status for every Geneve tunnel. When IKE Child SA is established for the tunnel, then metric for the tunnel
is set with 1, otherwise it will be set with 0. The IPsec tunnel status is being checked periodically for every 5s and metric will be updated accordingly.
Pull Request Test Coverage Report for Build 17601206192
Details
- 153 of 226 (67.7%) changed or added relevant lines in 5 files are covered.
- 45 unchanged lines in 4 files lost coverage.
- Overall coverage decreased (-0.005%) to 54.887%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| go-controller/pkg/metrics/node.go | 0 | 5 | 0.0% |
| go-controller/pkg/util/ovs.go | 2 | 8 | 25.0% |
| go-controller/pkg/metrics/ipsec.go | 18 | 25 | 72.0% |
| go-controller/pkg/metrics/ovs.go | 41 | 54 | 75.93% |
| go-controller/pkg/metrics/ovnkube_controller.go | 92 | 134 | 68.66% |
| <!-- | Total: | 153 | 226 |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| go-controller/pkg/ovn/namespace.go | 2 | 56.31% |
| go-controller/pkg/ovn/egressip.go | 4 | 67.84% |
| go-controller/pkg/ovn/controller/network_qos/network_qos.go | 15 | 63.56% |
| go-controller/pkg/ovn/controller/network_qos/network_qos_namespace.go | 24 | 49.62% |
| <!-- | Total: | 45 |
| Totals | |
|---|---|
| Change from base Build 17599322795: | -0.005% |
| Covered Lines: | 39026 |
| Relevant Lines: | 71103 |
💛 - Coveralls
Walkthrough
Adds IPsec support and monitoring: new ipsec runner, Geneve tunnel discovery, netlink XFRM subscription, IPsec tunnel-state metric and registration, node integration and config flag, unit and e2e tests, metrics mocks concurrency improvements, Helm and CI adjustments, and dependency updates.
Changes
| Cohort / File(s) | Summary of change |
|---|---|
IPsec parsing & monitorgo-controller/pkg/metrics/ipsec.go, go-controller/pkg/metrics/ovnkube_controller.go |
New ipsec client type and parser for established IPsec Child SAs; MonitorIPsecTunnelsState added to subscribe to XFRM events and update ipsec_tunnel_state Prometheus metric. |
OVS Geneve discoverygo-controller/pkg/metrics/ovs.go |
Added TunnelInfo type and JSON-based listGeneveTunnels to enumerate Geneve interfaces (name, admin/oper state, options). |
Node metrics integrationgo-controller/pkg/metrics/node.go, go-controller/cmd/ovnkube/ovnkube.go |
RegisterNodeMetrics signature changed to accept a WaitGroup and return error; wires MonitorIPsecTunnelsState when enable-ipsec is configured and propagates errors. |
Metrics mocks & GaugeVecgo-controller/pkg/metrics/mocks/gauge.go |
Made GaugeMock concurrency-safe (mutex, pointer receivers); added GaugeVecMock with per-label gauges and interface-stubbed methods. |
IPsec & OVS testsgo-controller/pkg/metrics/ipsec_test.go, go-controller/pkg/metrics/ovs_test.go |
Added thread-safe fake IPsec and OVS clients, Ginkgo/Gomega tests exercising ipsec_tunnel_state behavior and fake clients with synchronized output mutation. |
Exec helpers (ipsec runner)go-controller/pkg/util/ovs.go, go-controller/pkg/util/ovs_unit_test.go |
Added ipsec binary discovery field and RunIPsec wrapper; tests updated to include LookPath calls and ipsec presence/absence scenarios. |
Config & constantsgo-controller/pkg/config/config.go, go-controller/pkg/types/const.go |
Added EnableIPsec feature flag and CLI flag; added metric name constants MetricIPsecEnabled and MetricIPsecTunnelIKEChildSAState. |
Helm changes (ipsec & env)helm/ovn-kubernetes/charts/ovn-ipsec/templates/daemonset.yaml, helm/ovn-kubernetes/charts/ovnkube-db-raft/templates/statefulset.yaml, helm/ovn-kubernetes/charts/ovnkube-db/templates/deployment.yaml |
Removed v3_req extension in CSR generation; added stale pluto.pid cleanup; renamed ENABLE_IPSEC → OVN_ENABLE_IPSEC in db components. |
CI, scripts & kind.github/workflows/test.yml, go-controller/hack/test-go.sh, contrib/kind.sh |
Added IPsec matrix entry and ENABLE_IPSEC env, prom2json install steps, included metrics package in root_pkgs tests, switched CSR listing to kubectl. |
E2E teststest/e2e/ipsec.go, test/e2e/util.go |
New e2e IPsec test validating ipsec-enabled and ipsec_tunnel_ike_child_sa_state metrics and helper to read ENABLE_IPSEC env. |
Dependency updatego-controller/go.mod |
Promoted github.com/vishvananda/netns to direct dependency. |
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Node as ovnkube node
participant Register as RegisterNodeMetrics
participant OVS as ovs-vsctl runner
participant IPSec as ipsec runner
participant Netlink as XFRM multicast socket
participant Prom as Prometheus Gauge
Node->>Register: RegisterNodeMetrics(stop, wg)
alt enable-ipsec == true
Register->>Prom: register ipsec_tunnel_state gauge
Register->>Netlink: subscribe to XFRM groups
par initial evaluation
Register->>OVS: listGeneveTunnels()
Register->>IPSec: ipsec showstates
Register->>Prom: set gauge (0/1)
end
loop on XFRM events
Netlink-->>Register: state/policy event
Register->>OVS: listGeneveTunnels()
Register->>IPSec: ipsec showstates
Register->>Prom: update gauge
end
Node-->>Register: stop
Register->>Netlink: close socket
else
Register-->>Node: return (no IPsec monitor)
end
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60–90 minutes
Suggested labels
component/node, component/ovnkube-controller
Suggested reviewers
- tssurya
- martinkennelly
- trozet
Pre-merge checks (2 passed, 1 warning)
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.48% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The title succinctly and accurately summarizes the primary change introduced by this pull request—namely, adding a metric to track IPsec tunnel state for IKE Child SA establishment—and directly aligns with the PR’s main objective without extraneous detail. |
| Description Check | ✅ Passed | The description clearly explains that the PR registers the ovnkube_controller_ipsec_tunnel_ike_child_sa_state metric, describes its behavior when IKE Child SAs are established or not, and notes the implementation through xfrm netlink change events, which accurately reflects the changeset. |
Poem
A rabbit taps a metric drum—thump thump!
Geneve burrows lit, IPsec tunnels jump.
Pluto naps then wakes, pid files cleared away,
Gauges count the handshakes, zeroes flip to play.
Hop, hop—metrics bloom, a carrot-shaped hooray! 🥕
[!TIP]
👮 Agentic pre-merge checks are now available in preview!
Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
- Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
- Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.
Example:
reviews: pre_merge_checks: custom_checks: - name: "Undocumented Breaking Changes" mode: "warning" instructions: | Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post.
✨ Finishing Touches
- [ ] 📝 Generate Docstrings
🧪 Generate unit tests
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
/retest-failed
The following workflows runs were succesfully triggered: https://github.com/ovn-kubernetes/ovn-kubernetes/actions/runs/16023780242
The xfrm event subscription is failing in non-ic deployments due to permission issue and causing kind cluster bring up to fail. investigating...
The xfrm event subscription is failing in non-ic deployments due to permission issue and causing kind cluster bring up to fail. investigating...
There are permission issues in subscribing xfrm events in ovnkube-node pod in non-ic and multizone IC deployments. it's also needed to collect this metric as a node level metric, fixed that as well.
I dont have time to re-review this this week peri - under pressure.. btw some feedback - i subscribed to this because i thought it was simple metrics PR but it includes other things. Maybe split it into other PRs in the future. I am guilty of doing this too sometimes but its hard on the reviewer when so many things included and title misleading.
Needs simplifaction peri - its too complex as is imo and its hard to read the code imo.
I am removing myself from this Peri - i dont have time anymore :(