kuadrant-operator Add e2e tests for mTLS Configuration via Kuadrant CR

Description

Implement end-to-end tests in the Kuadrant/testsuite repository to verify expected system behavior and configuration, from a user's perspective, when toggling mTLS in the Kuadrant CR.

Test Cases

Functional Behaviour

Requests succeed when mTLS is disabled
- Kuadrant CR is configured with:
```
spec:
  mtls:
    enable: false
```
- AuthPolicy or RateLimitPolicy is in place
- Requests return expected responses (e.g. 403, 429)

Requests succeed when mTLS is enabled
- Kuadrant CR is patched to enable mTLS:
```
spec:
  mtls:
    enable: true
```
- Requests continue to work as expected, confirming that Gateway communication with data plane components works correctly over mTLS

Requests still succeed after mTLS is turned off again
- Kuadrant CR is patched to disable mTLS
- Requests continue to work as expected
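
The toggle sequence above could be driven from a test with something like the following. This is a minimal sketch, not the testsuite's actual helper: the function names, the CR name `kuadrant`, and the `kuadrant-system` namespace in the usage note are assumptions for illustration. Only the `spec.mtls.enable` field comes from this issue.

```python
from __future__ import annotations

import json


def mtls_patch(enabled: bool) -> dict:
    """Build a JSON merge patch that sets spec.mtls.enable on the Kuadrant CR."""
    return {"spec": {"mtls": {"enable": enabled}}}


def kubectl_patch_args(name: str, namespace: str, enabled: bool) -> list[str]:
    """Return the kubectl command line that applies the patch.

    `name` and `namespace` are whatever the Kuadrant CR is called in the
    cluster under test (hypothetical values in the usage note).
    """
    return [
        "kubectl", "patch", "kuadrant", name,
        "-n", namespace,
        "--type=merge",
        "-p", json.dumps(mtls_patch(enabled)),
    ]


# Order of states exercised by the three functional test cases:
# disabled, then enabled, then disabled again.
TOGGLE_SEQUENCE = [False, True, False]
```

For example, `kubectl_patch_args("kuadrant", "kuadrant-system", True)` yields the arguments for enabling mTLS. A test would run each state in `TOGGLE_SEQUENCE`, wait for the CR to settle, and then send the same requests against the protected route.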

mTLS Configuration Validation

PeerAuthentication resource is applied
- When mTLS is enabled, a PeerAuthentication resource exists in the kuadrant-system namespace with:
```
spec:
  mtls:
    mode: STRICT
```
- When mTLS is disabled, this resource should be removed or absent

Pod Labels and Sidecar Injection
- When mTLS is enabled, pods for Authorino and Limitador should have Istio sidecars injected (e.g. sidecar.istio.io/inject: "true")
- Authorino and Limitador pods should also carry the label kuadrant.io/managed: "true"

Kuadrant CR reaches Ready status
- After enabling mTLS, the Kuadrant CR should enter a Ready state, indicating that configuration has been successfully applied
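
The three validation checks above can be expressed as small predicates over the resources as returned by the Kubernetes API (plain dicts). This is a sketch under assumptions: the function names are hypothetical, and the Ready check assumes the Kuadrant CR reports a condition of type `Ready` in `status.conditions`.

```python
def peer_auth_is_strict(peer_auth: dict) -> bool:
    """True if the PeerAuthentication resource sets spec.mtls.mode to STRICT."""
    return peer_auth.get("spec", {}).get("mtls", {}).get("mode") == "STRICT"


def pod_has_mtls_labels(pod: dict) -> bool:
    """True if the pod carries both labels named in this issue."""
    labels = pod.get("metadata", {}).get("labels") or {}
    return (
        labels.get("sidecar.istio.io/inject") == "true"
        and labels.get("kuadrant.io/managed") == "true"
    )


def is_ready(resource: dict) -> bool:
    """True if the resource has a Ready condition with status "True".

    Assumes the standard Kubernetes conditions layout on the Kuadrant CR.
    """
    conditions = resource.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )
```

When mTLS is disabled, the test would instead assert that no PeerAuthentication resource is found in the namespace, rather than calling `peer_auth_is_strict`.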

Mar 24 '25 12:03 emmaaroche

I was reading about this and it made me wonder: does it make sense to verify that the communication between the gateway and the Kuadrant components actually happens over a secured TLS channel?

I'm not saying that the verification steps described should not be done; I think they are good to have. However, I see them as implementation details. More importantly, the feature from the user's perspective is that the communication is secured, so that is what should be tested. wdyt?

How to test that a communication channel is secured with TLS is another discussion. First let's clarify the what, and then we can think about the how.

Apr 10 '25 10:04 eguzki

@eguzki I agree, making sure the communication is actually secured with TLS is really the main thing from a user's perspective, so it makes sense to test that for sure.

The test cases I described are a first draft to get the ball rolling, so I will be further refining what should be tested and, as you said, once we're clear on that, the how can be figured out!

Apr 11 '25 07:04 emmaaroche

As we discussed, checking whether the communication between pods is really encrypted is out of scope for our testing. We will trust that Istio is implemented correctly and just check for the existence of the PeerAuthentication resource, with the correct labels applied to the affected pods.

Apr 23 '25 08:04 azgabur

@eguzki I have updated the test cases with more focus on end-to-end behavior. Regarding secure communication between the Gateway and Kuadrant components, Alex and I discussed this and agreed that we’ll focus on validating that the required Istio configuration for mTLS is correctly applied, rather than verifying the communication directly. As Alex mentioned in his comment above, this kind of verification is out of scope for our testing. Since Kuadrant relies on Istio for mTLS, confirming that the correct resources (like PeerAuthentication and sidecar injection) are in place should be a reliable way to verify that the feature is enabled and functioning as expected.

Would appreciate a review of the updated issue when you have a moment. Thanks!

Apr 23 '25 09:04 emmaaroche

Fair enough.

I agree that we should not be testing Istio's mTLS solution. That's assumed. The Istio team will have their own tests for that. My fear was that, after setting up many Istio resources and a lot of configuration, it looks like comms run on top of TLS connections while in reality they do not.

What about trying to access limitador and/or authorino endpoints with plain gRPC and validate connections are rejected?

Apr 23 '25 13:04 eguzki

In OpenShift that would require exposing some Authorino/Limitador service via a Route object, as there is no other way to reach an internal cluster IP from outside the cluster for the check. Not sure how we could do that in kind, maybe by adding a new LoadBalancer service?

Which service, gRPC port and endpoint would be good to use for such a check for Authorino and Limitador? Would that combination be stable in the future? @eguzki If it is easy enough, adding such a check could be fine even if it is a bit redundant.

In any case, if all the cases in the mTLS Configuration Validation section are checked, it should mean that Istio is correctly set up, and therefore Istio will ensure that all communication in the mesh towards those pods is encrypted. At least that's how I understand the contract provided by Istio's PeerAuthentication object, but maybe I am mistaken and there are ways to bypass it. Also, actually checking that communication between the Gateway pod and the Kuadrant pods is encrypted would require somehow capturing packets in the Istio mesh and analyzing them, which is indeed out of scope for this issue; we should just depend on the Istio implementation being correct.

Apr 23 '25 14:04 azgabur

Just an idea: you could run a k8s job that has grpcurl client installed and does a custom request like this to the exposed gRPC service:

```
grpcurl -plaintext -d @ LIMITADOR_SERVICE:LIMITADOR_PORT envoy.service.ratelimit.v3.RateLimitService.ShouldRateLimit <<EOM
{
    "domain": "test_namespace",
    "hits_addend": 1,
    "descriptors": [
        {
            "entries": [
                {
                    "key": "req.method",
                    "value": "POST"
                },
                {
                    "key": "req.path",
                    "value": "/"
                }
            ]
        }
    ]
}
EOM
```

Additionally, there are a few REST HTTP endpoints like GET /limits/${namespace}, GET /counters/${namespace}, GET /metrics (full OpenAPI spec here, which might be outdated). But I would test the Kuadrant data plane gRPC endpoints, which are ultimately what matters.

The job would fail when the connection cannot be established. From your testsuite, you would check the job's status. For the test to succeed, the status should report a failed job (specifically with the error that the connection could not be established).

Same idea for Authorino, but with a different endpoint: the Envoy external authorization service, envoy.service.auth.v3.Authorization/Check, which takes a CheckRequest.
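
To make the job idea a bit more concrete, here is a rough sketch of how a test could build the Job manifest and interpret its status. Everything not in the comment above is an assumption: the function names, the `fullstorydev/grpcurl` image (assumed to use `grpcurl` as its entrypoint), and the `sidecar.istio.io/inject: "false"` label, which is there on the assumption that the client pod has to stay outside the mesh so that its own sidecar does not upgrade the call to mTLS. It also does not inspect the pod logs for the specific connection error.

```python
import json


def grpcurl_job(name: str, target: str, method: str, payload: dict) -> dict:
    """Build a batch/v1 Job that makes one plaintext gRPC call with grpcurl."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # a single failed attempt marks the Job failed
            "template": {
                # Assumption: keep the client out of the mesh.
                "metadata": {"labels": {"sidecar.istio.io/inject": "false"}},
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "grpcurl",
                        # Assumption: image whose entrypoint is grpcurl.
                        "image": "fullstorydev/grpcurl:latest",
                        "args": ["-plaintext", "-d", json.dumps(payload),
                                 target, method],
                    }],
                },
            },
        },
    }


def plaintext_was_rejected(job_status: dict) -> bool:
    """True if the Job failed and never succeeded, i.e. the call did not go through."""
    failed = job_status.get("failed") or 0
    succeeded = job_status.get("succeeded") or 0
    return failed >= 1 and succeeded == 0


# Same request body as in the grpcurl example from this thread.
RATELIMIT_PAYLOAD = {
    "domain": "test_namespace",
    "hits_addend": 1,
    "descriptors": [{"entries": [
        {"key": "req.method", "value": "POST"},
        {"key": "req.path", "value": "/"},
    ]}],
}
```

Usage would be along the lines of `grpcurl_job("plaintext-check", "LIMITADOR_SERVICE:LIMITADOR_PORT", "envoy.service.ratelimit.v3.RateLimitService.ShouldRateLimit", RATELIMIT_PAYLOAD)`, then creating the Job, waiting for it to finish, and asserting `plaintext_was_rejected(job["status"])` while mTLS is enabled. With mTLS disabled, the call only succeeds if grpcurl can resolve the method, via server reflection or supplied proto files.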

Anyway, just giving ideas to ensure comms are secured. As said, no need to go that far.

Apr 23 '25 15:04 eguzki