Kuadrant CR status reports mTLS enabled when it's not working on OCP 4.19+
Problem
The Kuadrant CR status incorrectly reports that mTLS is enabled when it's actually not working.
What We See
The Kuadrant status shows:
- status.mtlsAuthorino: true
- status.mtlsLimitador: true
- status.conditions[type=Ready].status: True
However, mTLS connectivity is broken and policies are failing silently.
Expected Behavior
If the status reports mTLS as enabled, it should actually be working.
Example Where This Happens
- Platform: OpenShift 4.19+ with managed Gateway API
- Root cause: The Gateway API installation method is different in OCP 4.19+, which breaks mTLS configuration
- Problem: The Kuadrant status still reports mTLS as enabled despite it not working
After investigating the behavior on OpenShift 4.19, we have identified the root cause and a workaround.
Here is a summary of the findings and the immediate corrective actions available.
Root Cause Analysis
The issue stems from how the Cluster Ingress Operator (CIO) in OpenShift 4.19 manages the default Istio instance (OSSM v3.x) in the openshift-ingress namespace.
- The Constraint: On OpenShift, standard istio-init containers do not comply with default Security Context Constraints (SCCs). Therefore, successful sidecar injection relies on the Istio CNI agent running on every node.
- The Configuration Clash: The Istio instance managed by the CIO has CNI disabled by default:
    spec:
      values:
        pilot:
          cni:
            enabled: false
- The Result: Because CNI is disabled, sidecar injection fails for the Rate Limiting service. When Kuadrant enables mTLS:
  - The Gateway attempts to initiate a TLS connection.
  - The Rate Limiting (or Auth) service (lacking a sidecar to terminate TLS) rejects the connection.
  - The protection policies fail silently.
- Why the Status reports "Ready": Kuadrant reports the status as Ready and Effective because, from the Control Plane perspective, the policy resources were generated and applied successfully. The failure occurs strictly at the data plane network level (connection rejection), which the current status logic does not detect.
Immediate Workaround
To unblock your environment on OpenShift 4.19, you can deploy a custom instance of Istio with CNI enabled.
Steps:
- Remove the GatewayClass resource where controllerName is set to openshift.io/gateway-controller/v1 to prevent conflicts.
- Remove the Istio instance managed by the Ingress Controller.
- Deploy your own Istio instance ensuring CNI is enabled in the configuration.
Note: ~A theoretical workaround involves enabling CNI on the CIO-managed instance. I could not find a documented way, though. My attempts were overwritten by the operator.~
EDIT: CNI is not available either.
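As a rough illustration of the last step, a custom control plane with CNI could be declared along these lines. This is only a sketch: it assumes the Sail Operator API (sailoperator.io/v1) that ships with OSSM 3.x, the istio-system and istio-cni namespaces are placeholders not taken from this thread, and the values path simply mirrors the CIO-managed snippet above with the flag flipped.

```yaml
# Assumption: Sail Operator (OSSM 3.x) CRDs are installed.
# Namespaces below are placeholders, not values from this issue.
apiVersion: sailoperator.io/v1
kind: IstioCNI
metadata:
  name: default
spec:
  namespace: istio-cni          # placeholder namespace for the CNI agent
---
apiVersion: sailoperator.io/v1
kind: Istio
metadata:
  name: default
spec:
  namespace: istio-system       # placeholder namespace for the control plane
  values:
    pilot:
      cni:
        enabled: true           # inverse of the CIO-managed default shown above
```

Whether additional settings are needed depends on the operator version in use, so treat this as a starting point rather than a verified configuration.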
Resolution
- Update Kuadrant documentation to note that if mTLS (mutual TLS) is required, the Istio instance managed by the Cluster Ingress Operator (CIO) is not a viable option because it lacks the necessary mesh capabilities, which Kuadrant's mTLS feature relies on. Therefore, you must create a custom Istio CR (Custom Resource) with CNI (Container Network Interface) enabled. The Gateway API can also be enabled on this custom Istio CR. Crucially, when defining your Gateways, ensure they do not use the openshift.io/gateway-controller/v1 controller name. This prevents the Cluster Ingress Operator from attempting to manage resources for your custom Istio control plane.
- [Nice to have] Look into better observability to ensure Kuadrant status reflects data plane connectivity failures (detecting the silent failure).
- ~Long term: Explore Istio Ambient Mesh. In theory, Ambient Mesh can achieve mTLS between pods without relying on the sidecar injection mechanism that is currently failing in this specific OCP configuration. Check https://www.redhat.com/en/blog/introducing-openshift-service-mesh-32-istios-ambient-mode~
  EDIT: Istio Ambient Mesh is not available either.
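To make the controller-name point concrete, a GatewayClass and Gateway for the custom control plane might look like the sketch below. The class name, Gateway name, namespace, and listener are illustrative assumptions; istio.io/gateway-controller is upstream Istio's Gateway API controller name, used here in place of openshift.io/gateway-controller/v1.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: istio                     # hypothetical class name
spec:
  # Upstream Istio controller, not openshift.io/gateway-controller/v1,
  # so the Cluster Ingress Operator does not claim this class.
  controllerName: istio.io/gateway-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway           # hypothetical
  namespace: example-namespace    # hypothetical
spec:
  gatewayClassName: istio         # references the class above
  listeners:
    - name: http
      protocol: HTTP
      port: 80
```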
I have edited the message above. TL;DR: this issue is resolved with documentation.
Documenting the workaround (a different way of deploying Istio) solves the problem of mTLS not being usable on OCP 4.19+. But I would say the issue still stands. Do we want the Kuadrant CR to report the mTLS feature as ready even though OSSM is configured without sidecar and CNI support? I would say it's not unreasonable to assume a user would set up OSSM on OCP 4.19 in the recommended Gateway API way from the OpenShift docs.
I agree, that's the [Nice to have] point described above.