Enable per-source DestinationRule policies
Describe the bug We always apply the trafficPolicy at the root of the DestinationRules while we should apply it at the subset level
And we should automatically generate different subsets if there's a conflict
To Reproduce
cat << EOF | kubectl --context mgmt apply -f -
apiVersion: networking.mesh.gloo.solo.io/v1
kind: TrafficPolicy
metadata:
namespace: gloo-mesh
name: simple-1
spec:
sourceSelector:
- kubeWorkloadMatcher:
namespaces:
- default
destinationSelector:
- kubeServiceRefs:
services:
- clusterName: cluster1
name: reviews
namespace: default
policy:
trafficShift:
destinations:
- kubeService:
clusterName: cluster2
name: reviews
namespace: default
subset:
version: v3
weight: 75
- kubeService:
clusterName: cluster1
name: reviews
namespace: default
subset:
version: v1
weight: 15
- kubeService:
clusterName: cluster1
name: reviews
namespace: default
subset:
version: v2
weight: 10
EOF
cat << EOF | kubectl --context mgmt apply -f -
apiVersion: networking.mesh.gloo.solo.io/v1
kind: TrafficPolicy
metadata:
namespace: gloo-mesh
name: simple-2
spec:
sourceSelector:
- kubeWorkloadMatcher:
namespaces:
- istio-system
destinationSelector:
- kubeServiceRefs:
services:
- clusterName: cluster1
name: reviews
namespace: default
policy:
requestTimeout: 1s
outlierDetection:
consecutiveErrors: 10
interval: 10s
baseEjectionTime: 2m
maxEjectionPercent: 100
trafficShift:
destinations:
- kubeService:
clusterName: cluster2
name: reviews
namespace: default
subset:
version: v3
weight: 75
- kubeService:
clusterName: cluster1
name: reviews
namespace: default
subset:
version: v1
weight: 15
- kubeService:
clusterName: cluster1
name: reviews
namespace: default
subset:
version: v2
weight: 10
EOF
Expected behavior In this example, the outlierDetection should only apply to requests coming from the istio-system namespace.
But we create the DestinationRule like that:
spec:
host: reviews.default.svc.cluster.local
subsets:
- labels:
version: v1
name: version-v1
- labels:
version: v2
name: version-v2
trafficPolicy:
outlierDetection:
baseEjectionTime: 120s
consecutive5xxErrors: 10
interval: 10s
maxEjectionPercent: 100
tls:
mode: ISTIO_MUTUAL
We should have 4 subsets. Something like trafficpolicy-gloo-mesh-simple-1-version-1, trafficpolicy-gloo-mesh-simple-2-version-1 , ...
Additional context Add any other context about the problem here, e.g.
- Gloo Mesh version: 1.0
In this example, the outlierDetection should only apply to requests coming from the istio-system namespace.
To my knowledge, there's no way to have DestinationRule config apply selectively per source. We've indicated this for outlierDetection in the documentation, (we also need to mention this for TrafficPolicy's mTLS field)
Expanding on this issue:
Motivation
We've received some customer interest in the ability to defined DestinationRule config at a per-source granularity. Specifically, one customer has a use case where they need to define different ConnectionPool settings for different service consumers (i.e. for different traffic sources).
Proposed Approach
The goal is to provide a way for users to specify DestinationRule config at a per-source granularity. Unlike the VirtualService, which provides first class support for source filtering through the HttpMatchRequest object, the DestinationRule provides only the ability to define policies at a per subset level.
Working within this constraint, the approach to implementing per-source DestinationRule config would be the following:
-
In the DestinationRule translated for the host(s), Gloo Mesh creates a subset for each distinct TrafficPolicy sourceSelector, and attaches the policies under this subset.
-
In the VirtualService translated for the host(s), Gloo Mesh creates a traffic shift to the subset created above, applied to requests originating from workloads defined by the TrafficPolicy's sourceSelectors.
The VirtualService's traffic shift to the subset achieves the source filtering, and the DestinationRule's subset policy achieves the per-source DestinationRule config.
Example:
Expanding on Denis' example above (but removing the traffic shifts from the TrafficPolicy), Gloo Mesh would output the following:
- DestinationRule
spec:
host: reviews.default.svc.cluster.local
subsets:
# gloo mesh generated subsets
# verify that omitting labels will direct traffic to all backing workloads for the host
- name: gm-simple-1
trafficPolicy:
tls:
mode: ISTIO_MUTUAL
- name: gm-simple-2
trafficPolicy:
outlierDetection:
baseEjectionTime: 120s
consecutive5xxErrors: 10
interval: 10s
maxEjectionPercent: 100
tls:
mode: ISTIO_MUTUAL
- VirtualService
spec:
hosts:
- reviews.bookinfo.svc.cluster.local
http:
- match:
- sourceNamespace: default
route:
- destination:
host: reviews.default.svc.cluster1
subset: gm-simple-1
- match:
- sourceNamespace: istio-system
route:
- destination:
host: reviews.default.svc.cluster1
subset: gm-simple-2
Caveats / questions:
-
We should validate that the semantics of per-subset DR policies + traffic shifting to the subset work in Istio as expected.
-
The auto-generated traffic shift may conflict with user-defined traffic shifts. This problem already exists today between user-defined policies, but this feature would introduce the possibility of conflict with generated traffic shifts. We need to make sure our conflict detection and error reporting is clear and robust.
@djannot I think the prob you described makes sense, the example is poor as Istio-system is also the root namespace, which means the config in that ns is applied to all namespaces.
@harveyxia @djannot my initial thoughts is to allow exportTo on DR to specify service like sourceNamespace/sourceService. This would help ensure the scope of the DR at the service level. No API change is needed except the semantics of allowing namespace/service vs just namespace today.
Alternatively, we could have selector on DR.
Note: there is some docs on DR loading works when you could have DR in source ns, target ns and root ns. I'm having trouble to find that doc atm.