smi-spec `Matches` in `TrafficSplit` is unclear

In TrafficTarget, we refer to Rules (which "are traffic specs that define what traffic for specific protocols would look like. The kind can be different depending on what traffic a target is serving. In the following examples, HTTPRouteGroup is used for applications serving HTTP based traffic"). A rule has an optional matches field in case one wants to match on only one match condition defined in the TrafficSpec (HTTPRouteGroup) referenced in the rule. In TrafficSplit, we only have a Matches field, no Rules and Matches refers to a slice ofTypedLocalObjectReference, so you can refer to HTTPRouteGroup in the same namespace but we cannot refer to a specific match condition defined in the HTTPRouteGroup. This is slightly confusing and inconsistent. Do we want to be able to specify a match condition in a Traffic Split as well?

Related: https://github.com/servicemeshinterface/smi-spec/issues/190#issuecomment-801213434

Mar 17 '21 16:03 michelleN

I think this makes perfect sense expanding on the example in the doc that allows a canary to be created for firefox users.

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  matches:
  - kind: HTTPRouteGroup
    name: ab-test
    matches: 
      - firefox-users
  backends:
  - service: website-v1
    weight: 0
  - service: website-v2
    weight: 100

---
kind: HTTPRouteGroup
metadata:
  name: ab-test
matches:
- name: firefox-users
  headers:
  - user-agent: ".*Firefox.*"
 - name: safari-users
  headers:
  - user-agent: ".*Safari.*"

I wonder should we refactor to be more in line with TrafficAccess?

From

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  matches:
  - kind: HTTPRouteGroup
    name: ab-test
    matches: 
      - firefox-users
  backends:
  - service: website-v1
    weight: 0
  - service: website-v2
    weight: 100

To

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  rules:
  - kind: HTTPRouteGroup
    name: ab-test
    matches: 
      - firefox-users
  backends:
  - service: website-v1
    weight: 0
  - service: website-v2
    weight: 100

Mar 17 '21 17:03 nicholasjackson

@michelleN @nicholasjackson I hope I can suggest some inputs here, I like the design proposed by Nick. I have one minor input on that, basically to add backends to every match so that there can be multiple destinations and a default destination. Please advice if this is inappropriate.

Nick's Version:

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  rules:
  - kind: HTTPRouteGroup
    name: ab-test
    matches: 
      - firefox-users
  backends:
  - service: website-v1
    weight: 0
  - service: website-v2
    weight: 100

Suggested Change:

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  rules:
  - kind: HTTPRouteGroup
    name: ab-test
    matches: 
      - firefox-users
    backends:
     - service: website-v3
        weight: 50
  backends:
  - service: website-v1
    weight: 0
  - service: website-v2
    weight: 100

Mar 22 '21 05:03 kumarabd

Thanks @nicholasjackson and @kumarabd. Thanks for the suggestions. The benefits I see of Nic's version is that there is greater clarity and less room for error when building and validating a traffic split. Having one traffic split per set of backends allows for the same pattern as TrafficTarget which is a set of sources that have a set of rules (with match conditions) for reaching a destination. The equivalent of that pattern in TrafficSplit would be having all traffic with a set of rules (with match conditions) reach a set of backends (destination).

The tradeoff is the usability in @kumarabd's proposal which allows for one traffic split to contain essentially all the logic for splitting traffic to that service to any set of backends. If there is a single source of truth (one traffic split), there is only one split someone needs to modify. As a user it may be tedious to figure out which traffic splits to remove/update and in what order for the desired outcome. The set of backends under rules and also under spec makes the intended outcome slightly confusing to me. I think what that means is that if traffic does not match any of the rules then implicitly route traffic to the default set of backends. Is that correct @kumarabd ? The implicit configuration worries me slightly but I'd love for more folks to weigh in here.

Apr 01 '21 14:04 michelleN

Yes, Absolutely. To your point,

It will be tedious to figure out which traffic splits to update.
Even harder to figure out the priority/order of the traffic splits, and thus the desired outcome.
We could have a default set of backends, in the case if no rules match.

I see some merit here in terms of predicting the behaviour. // @michelleN @nicholasjackson

Apr 01 '21 16:04 kumarabd

From an implementation perspective, I wonder if it would be better to have a default set of backends or remove the option of the default backends and only apply backends to a specific set so that there is no implicit behavior and everything is explicit? My gut says explicit is better since we have different proxies in our community and these subtleties may make a large impact when it comes to expected behavior, but I definitely understand and empathize with the issues @kumarabd pointed out. Inspired by both of your examples, what do ya'll think about something like this:

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  splits:
    - rules:
      - kind: HTTPRouteGroup
        name: ab-test
        matches: 
          - firefox-users
      backends:
        - service: website-v1
          weight: 50
        - service: website-v2
          weight: 50
    - rules:
      - kind: HTTPRouteGroup
        name: ab-test
        matches: 
          - chrome-users
      backends:
        - service: website-v3
          weight: 20
        - service: website-v4
          weight: 80

Apr 01 '21 18:04 michelleN

Looks good to me, would allow you to define all the properties of a test in a single CRD. 🚢

Apr 01 '21 19:04 nicholasjackson

Looking at this. It shouldn't even break the sdk/Kubernetes API versioning requirements. Do you see any issues @nicholasjackson ?

Apr 05 '21 22:04 michelleN

Perhaps we should also include backends for non-matches (which we don't have) and backends for matches (which we have).

Jun 21 '21 20:06 johnsonshi

I think it would be fine @michelleN, moving from this version to the previous version would mean you have to drop anything other than the first rule but moving from a previous one to this would be unaffected. I think this is a happy compromise in the name of progress.

Jun 22 '21 09:06 nicholasjackson

I really like your examples and thought about them, but I having a key for explicit default backends would be less confusing. Right now, if a TrafficSplit does not match any of the match rules (say a non-Firefox/Chrome user like a Safari user), then when a request is made to website, the Safari user will be routed to whatever the website service points to. Explicitly allowing a default split would be less confusing in my opinion.

Would love your thoughts on this @nicholasjackson @kumarabd

Examples

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  splits:
    - rules:
      - kind: HTTPRouteGroup
        name: ab-test
        matches: 
          - firefox-users
      backends:
        - service: website-v1
          weight: 50
        - service: website-v2
          weight: 50
    - rules:
      - kind: HTTPRouteGroup
        name: ab-test
        matches: 
          - chrome-users
      backends:
        - service: website-v3
          weight: 20
        - service: website-v4
          weight: 80
    - default:
      backends:
        - service: website
          weight: 100

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  splits:
    - rules:
      - kind: HTTPRouteGroup
        name: ab-test
        matches: 
          - firefox-users
      backends:
        - service: website-v1
          weight: 50
        - service: website-v2
          weight: 50
    - rules:
      - kind: HTTPRouteGroup
        name: ab-test
        matches: 
          - chrome-users
      backends:
        - service: website-v3
          weight: 20
        - service: website-v4
          weight: 80
    - default:
      backends:
        - service: website-v1
          weight: 20
        - service: website-v2
          weight: 80

Jun 29 '21 16:06 johnsonshi

@johnsonshi Your proposal makes sense to me.

We should think about how to handle conflicts between traffic splits that reference the same root service. Should there only be 1 traffic split per root service? I'm not sure that's a good idea but just throwing out an example.
Since several meshes are already using Traffic Split, I think making this big of a change should be postponed till v2 of traffic split. We should try to get the version of traffic split we currently have to v1 imo. Objections welcome.

On a different note, I've been thinking more about this. As I tried to go and just rename things, I came to the conclusion that if the spec.matches in HTTPRouteGroup was actually spec.routes (which is how it is referred to in the traffic specs spec), references the routes would be much cleaners in Traffic Split. We could keep spec.matches in Traffic Split and references routes under each match. It would look like:

kind: TrafficSplit
metadata:
  name: ab-test
spec:
  service: website
  matches:
  - kind: HTTPRouteGroup
    name: ab-test
    routes: 
      - firefox-users
  backends:
  - service: website-v1
    weight: 0
  - service: website-v2
    weight: 100

---
kind: HTTPRouteGroup
metadata:
  name: ab-test
spec:
  routes:
  - name: firefox-users
     headers:
     - user-agent: ".*Firefox.*"
  - name: safari-users
     headers:
     - user-agent: ".*Safari.*"

What do ya'll think? @nicholasjackson @kumarabd @johnsonshi

PR for traffic specs changes here: https://github.com/servicemeshinterface/smi-spec/pull/229

Aug 11 '21 03:08 michelleN

@michelleN I agree with the fact that the changes involved is quite big and that should be covered in the later versions. To your point 1, I think there could be multiple traffic split per root service. They can be chained logically with their creation timestamps or have a priority field introduced.

@johnsonshi Your proposal looks a lot cleaner. I was wondering if it will be a challenge to have default as a part of the same array. A separate member makes more sense to me. I still want to hear more about this from the rest if there is a better way to do it.

// @nicholasjackson

Aug 12 '21 18:08 kumarabd

smi-spec smi-spec copied to clipboard

`Matches` in `TrafficSplit` is unclear

smi-spec
smi-spec copied to clipboard