Clarifications on traffic split

Open abonas opened this issue 5 years ago • 5 comments

I'm referring to this document: https://github.com/deislabs/smi-spec/blob/master/traffic-split.md

  1. Can someone clarify this statement? "Weighting traffic between various services is also more generally useful than driving canary releases." What is the difference between traffic weights and canary releases that is meant here? Why is one more useful than the other?

  2. "The resource is associated with a root service" - what is a root service?

  3. In this example (https://github.com/deislabs/smi-spec/blob/master/traffic-split.md#specification), 3 services receive the following traffic distribution: 10m, 100m, 1500m. What is "m"? What do 10, 100, and 1500 mean in relation to each other? How do they form 100% of the traffic?

  4. In this section (https://github.com/deislabs/smi-spec/blob/master/traffic-split.md#workflow), there is an example with a traffic split of 1 and 0m. What does 1 mean? All traffic? It also lacks the "m" suffix, while the zero has it.

  5. "Weights vs percentages - the primary reason for weights is in failure situations. For example, if 50% of traffic is being sent to a service that has no healthy endpoints - what happens? Weights are simpler to reason about when the underlying applications are changing." Can you elaborate? I still don't see the benefit of using weights, despite the example.

abonas avatar May 22 '19 12:05 abonas

Thanks @abonas for taking the time to go through the spec. Here is what I can elaborate on:

> 2. "The resource is associated with a root service" - what is a root service?

The root service is the one that everyone sends traffic to. For example, suppose you are rolling out a new version of your application webapp using TrafficSplit, and everyone who talks to webapp uses the URL http://webapp.

Even though you might later have other services such as webapp-v1 and webapp-v2, the root service here is still webapp.

> 3. In this example (https://github.com/deislabs/smi-spec/blob/master/traffic-split.md#specification), 3 services receive the following traffic distribution: 10m, 100m, 1500m. What is "m"? What do 10, 100, and 1500 mean in relation to each other? How do they form 100% of the traffic?

Here the specification tries to align itself with the way Kubernetes defines its resources, where "m" means milli (thousandths). So 1500m == 1.5, 100m == 0.1, and so on. In the Istio implementation of TrafficSplit we add all the weights up and calculate each one's share of the total, so the shares together form 100%.
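To make the arithmetic concrete, here is a minimal Python sketch of that normalization. It is not taken from the spec or from the Istio adapter: the helper names `parse_weight` and `traffic_shares` and the backend names are made up for illustration, and it only handles plain numbers and the `m` (milli) suffix rather than the full Kubernetes quantity syntax.

```python
from fractions import Fraction


def parse_weight(quantity):
    """Parse a weight such as "1500m" or "1" into an exact number.

    Assumption: only plain numbers and the "m" (milli) suffix are handled.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return Fraction(int(quantity[:-1]), 1000)
    return Fraction(quantity)


def traffic_shares(weights):
    """Turn relative weights into each backend's share of the total traffic."""
    parsed = {name: parse_weight(w) for name, w in weights.items()}
    total = sum(parsed.values())
    if total == 0:
        raise ValueError("at least one backend needs a non-zero weight")
    return {name: w / total for name, w in parsed.items()}


# Hypothetical backend names; the weights are the ones from the question.
shares = traffic_shares({"backend-a": "10m", "backend-b": "100m", "backend-c": "1500m"})
for name, share in shares.items():
    print(f"{name}: {float(share):.1%}")
```

With the weights from the question, the three backends end up with roughly 0.6%, 6.2%, and 93.2% of the traffic (10, 100, and 1500 out of a total of 1610 milli-units).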

> 4. In this section (https://github.com/deislabs/smi-spec/blob/master/traffic-split.md#workflow), there is an example with a traffic split of 1 and 0m. What does 1 mean? All traffic? It also lacks the "m" suffix, while the zero has it.

Yes, in that example 1 means all traffic is sent to that service. The m suffix on the zero makes no difference, because 0m is still zero.

surajssd avatar May 27 '19 05:05 surajssd

> Thanks @abonas for taking the time to go through the spec. Here is what I can elaborate on:
>
> 2. "The resource is associated with a root service" - what is a root service?
>
> The root service is the one that everyone sends traffic to. For example, suppose you are rolling out a new version of your application webapp using TrafficSplit, and everyone who talks to webapp uses the URL http://webapp.

Thanks for your explanation! The definition of a root service should be clearly stated as part of the spec. See issue #44.

> Even though you might later have other services such as webapp-v1 and webapp-v2, the root service here is still webapp.
>
> 3. In this example (https://github.com/deislabs/smi-spec/blob/master/traffic-split.md#specification), 3 services receive the following traffic distribution: 10m, 100m, 1500m. What is "m"? What do 10, 100, and 1500 mean in relation to each other? How do they form 100% of the traffic?
>
> Here the specification tries to align itself with the way Kubernetes defines its resources.

Why does it need to align? Resources in k8s are not the same thing as the traffic flowing through it.

> So basically 1500m == 1.5, 100m == 0.1, etc. In the Istio implementation of TrafficSplit we add those all up and calculate the relative percentage to form 100%.

The math here is still unclear to me. How do those numbers sum up to 100%? It's not intuitive. Moreover, even the Istio example itself has percentages in its titles, which are then converted to "m" units, so the user constantly needs to convert percentages to some other unit. That's not a good user experience IMO.

> 4. In this section (https://github.com/deislabs/smi-spec/blob/master/traffic-split.md#workflow), there is an example with a traffic split of 1 and 0m. What does 1 mean? All traffic? It also lacks the "m" suffix, while the zero has it.
>
> Yes, 1 means all traffic is sent to it. The m suffix on the zero makes no difference, because 0m is still zero.

Sometimes the entire traffic is referred to as 1000, sometimes as 100, sometimes as 1. The spec is not consistent and IMO makes things more complex than needed.

abonas avatar May 27 '19 10:05 abonas

@abonas care to take a swing at getting a list of terms to be defined in the Glossary? And then we could all jump in and clearly define them?

christian-posta avatar Jun 03 '19 16:06 christian-posta

@abonas Istio uses percentages; we purposely avoided percentages and went for weights instead (see the tradeoffs section for more details). Because of this, the weights don't need to add up to 100, or 1, or any other particular total. The weights are relative and extremely configurable on purpose. There's no reason to use the m notation if you don't want to; whole numbers work as well.
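As a rough illustration of what "relative" means here, the following Python sketch (not taken from any mesh implementation; `relative_shares` is a hypothetical helper) shows that only the ratio between weights matters, so a total of 1, 100, or 1000 describes the same split:

```python
from fractions import Fraction


def relative_shares(weights):
    """Divide each weight by the total; the absolute scale cancels out."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one backend needs a non-zero weight")
    return {name: Fraction(w, total) for name, w in weights.items()}


# "All traffic to webapp-v1" written on three different scales.
all_to_v1 = [
    relative_shares({"webapp-v1": 1, "webapp-v2": 0}),
    relative_shares({"webapp-v1": 100, "webapp-v2": 0}),
    relative_shares({"webapp-v1": 1000, "webapp-v2": 0}),
]
print(all(split == all_to_v1[0] for split in all_to_v1))

# A 90/10 split written with small and large whole numbers.
print(relative_shares({"webapp-v1": 9, "webapp-v2": 1})
      == relative_shares({"webapp-v1": 900, "webapp-v2": 100}))
```

Both comparisons come out equal, which is why the same split can appear as 1, 100, or 1000 in different examples.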

grampelberg avatar Jun 03 '19 16:06 grampelberg

> @abonas care to take a swing at getting a list of terms to be defined in the Glossary? And then we could all jump in and clearly define them?

@christian-posta https://github.com/deislabs/smi-spec/issues/44

abonas avatar Jun 03 '19 19:06 abonas