contour icon indicating copy to clipboard operation
contour copied to clipboard

Mirroring doesn't work correctly with weights

Open jpeach opened this issue 4 years ago • 8 comments

I specified this proxy:

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: artifactory-v2
spec:
  routes:
    services:
    - name: artifactory-prod
      port: 80
      weight: 95
    - service: artifactory-dev
      port: 80
      weight: 4
    - service: artifactory-sampler
      port: 80
      weight: 1
      mirror: true

And my expectation was that 1% of traffic would be mirrored to artifactory-sampler. However, the envoy configuration ends up like this:

         "routes": [
            {
              "match": {
                "prefix": "/v2/token/"
              },
              "route": {
                "weighted_clusters": {
                  "clusters": [
                    {
                      "name": "default/artifactory-dev/80/da39a3ee5e",
                      "weight": 4
                    },
                    {
                      "name": "default/artifactory-prod/80/da39a3ee5e",
                      "weight": 95
                    }
                  ],
                  "total_weight": 99
                },
                "request_mirror_policy": {
                  "cluster": "default/artifactory-sampler/80/da39a3ee5e"
                }
              }
            },
            {
              "match": {
                "prefix": "/v1/token/"
              },
              "route": {
                "weighted_clusters": {
                  "clusters": [
                    {
                      "name": "default/artifactory-dev/80/da39a3ee5e",
                      "weight": 4
                    },
                    {
                      "name": "default/artifactory-prod/80/da39a3ee5e",
                      "weight": 95
                    }
                  ],
                  "total_weight": 99
                },
                "request_mirror_policy": {
                  "cluster": "default/artifactory-sampler/80/da39a3ee5e"
                }
              }
            }
          ]
        },

Which, IIUC, sends mirrors all the traffic from the route. If weights can't be combined with mirroring, we should emit a status condition to tell the user that. However, I think that we can look at the weights and automatically set a runtime_fraction for the mirror, so that this behaves the way it looks.

https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/route/route.proto#route-routeaction-requestmirrorpolicy

jpeach avatar Nov 12 '19 22:11 jpeach

See #459

jpeach avatar Nov 12 '19 22:11 jpeach

This is probably my mistake for reusing the service type for mirroring. I think the solution may be to define a mirror service type which lacks weighting.

This might be a minor compatibility break, so the non breaking alternative would be to simply ignore the weight field and document it as such.

On 12 Nov 2019, at 23:05, James Peach [email protected] wrote:

 I specified this proxy:

apiVersion: projectcontour.io/v1 kind: HTTPProxy metadata: name: artifactory-v2 spec: routes: services: - name: artifactory-prod port: 80 weight: 95 - service: artifactory-dev port: 80 weight: 4 - service: artifactory-sampler port: 80 weight: 1 mirror: true And my expectation was that 1% of traffic would be mirrored to artifactory-sampler. However, the envoy configuration ends up like this:

     "routes": [
        {
          "match": {
            "prefix": "/v2/token/"
          },
          "route": {
            "weighted_clusters": {
              "clusters": [
                {
                  "name": "default/artifactory-dev/80/da39a3ee5e",
                  "weight": 4
                },
                {
                  "name": "default/artifactory-prod/80/da39a3ee5e",
                  "weight": 95
                }
              ],
              "total_weight": 99
            },
            "request_mirror_policy": {
              "cluster": "default/artifactory-sampler/80/da39a3ee5e"
            }
          }
        },
        {
          "match": {
            "prefix": "/v1/token/"
          },
          "route": {
            "weighted_clusters": {
              "clusters": [
                {
                  "name": "default/artifactory-dev/80/da39a3ee5e",
                  "weight": 4
                },
                {
                  "name": "default/artifactory-prod/80/da39a3ee5e",
                  "weight": 95
                }
              ],
              "total_weight": 99
            },
            "request_mirror_policy": {
              "cluster": "default/artifactory-sampler/80/da39a3ee5e"
            }
          }
        }
      ]
    },

Which, IIUC, sends mirrors all the traffic from the route. If weights can't be combined with mirroring, we should emit a status condition to tell the user that. However, I think that we can look at the weights and automatically set a runtime_fraction for the mirror, so that this behaves the way it looks.

https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/route/route.proto#route-routeaction-requestmirrorpolicy

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

davecheney avatar Nov 13 '19 07:11 davecheney

On 13 Nov 2019, at 6:18 pm, Dave Cheney [email protected] wrote:

This is probably my mistake for reusing the service type for mirroring. I think the solution may be to define a mirror service type which lacks weighting.

This might be a minor compatibility break, so the non breaking alternative would be to simply ignore the weight field and document it as such.

I was expecting that most mirroring would need a percentage. Mirroring all the traffic for a site seems like overkill in most cases :)

jpeach avatar Nov 13 '19 08:11 jpeach

Excellent point. I guess maybe for the moment we just document that weighting isn’t implemented.

On 13 Nov 2019, at 09:54, James Peach [email protected] wrote:



On 13 Nov 2019, at 6:18 pm, Dave Cheney [email protected] wrote:

This is probably my mistake for reusing the service type for mirroring. I think the solution may be to define a mirror service type which lacks weighting.

This might be a minor compatibility break, so the non breaking alternative would be to simply ignore the weight field and document it as such.

I was expecting that most mirroring would need a percentage. Mirroring all the traffic for a site seems like overkill in most cases :) — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

davecheney avatar Nov 13 '19 10:11 davecheney

@jpeach I had a look in the docs and mirroring does support a runtime percentage. We could overload the definition of weight to mean "percentage" or come up with some logic for how the mirror cluster's weight plays into the others.

davecheney avatar Nov 18 '19 05:11 davecheney

On 17 Nov 2019, at 9:50 pm, Dave Cheney [email protected] wrote:

@jpeach I had a look in the docs and mirroring does support a runtime percentage. We could overload the definition of weight to mean "percentage" or come up with some logic for how the mirror cluster's weight plays into the others.

Yep, that's exactly what I had in mind :)

jpeach avatar Nov 18 '19 17:11 jpeach

Thinking about it a little more I’m not sure how to add mirroring percentage.

Fundamentally weighting and percentage are not the same, at least in the way the former is implemented in contour. The weight is defined as the numerator over the sum of all weights defined. This gives a relative percentage and is further complicated when the restrictions of the mirror service not contributing to the weight calculation.

What I’d like to propose is the current mirror: true syntax be deprecated and we add a new key to route, something like

route: mirror: ... service stuff percentage: nn // optional, defaults to 100

WDYT?

On 19 Nov 2019, at 04:39, James Peach [email protected] wrote:



On 17 Nov 2019, at 9:50 pm, Dave Cheney [email protected] wrote:

@jpeach I had a look in the docs and mirroring does support a runtime percentage. We could overload the definition of weight to mean "percentage" or come up with some logic for how the mirror cluster's weight plays into the others.

Yep, that's exactly what I had in mind :) — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

davecheney avatar Nov 19 '19 03:11 davecheney

We can define the behavior when mirror and routes are both present, we just need to be consistent and make it clear in the docs. Based on how mirroring is described in Contour docs today: “Per route, a service can be nominated as a mirror. The mirror service will receive a copy of the read traffic sent to any non mirror service”, in this case I would think traffic sent to artifactory-prod and artifactory-dev are both mirrored to artifactory-sampler? Is that not how it works?

As part of improvements to docs on mirroring, maybe we can add example of

Mirror services + weighting More than 1 mirror service on a route

xaleeks avatar Apr 13 '21 18:04 xaleeks

Hi team,

Any update on adding weights to mirroring config, similar to how Envoy supports runtime fraction. If yes, any reference doc would be helpful.

nageshs28 avatar Feb 13 '23 08:02 nageshs28

Hi team,

Any update on adding weights to mirroring config, similar to how Envoy supports runtime fraction. If yes, any reference doc would be helpful.

@nageshs28 there is no update here, we'd need some help resolving the semantics of weight when it applies to a mirror, since it's not currently equivalent to percentage (i.e. the sum of the weights need not be 100). If we can come up with a proposal for the API semantics that make sense, then this seems fairly easy to implement via runtime fraction.

IMO, overloading weight to mean percentage of traffic to mirror when applied to a Service with mirror: true could work as long as we clearly document this in the API docs and website. I would think it would then not be considered as part of the denominator when considering the split of traffic across non-mirror services.

skriss avatar Mar 21 '23 20:03 skriss

What about this?

  1. Introduce a new field called percentage inside the mirror object that would specify the percentage of traffic to be mirrored to the corresponding service.

  2. If the percentage field is not provided, it defaults to 100, meaning that all traffic to non-mirror services will be mirrored to the mirror service.

  3. Mirror services and non-mirror services have their own weight calculation which do not interfer with each other. Non-mirror services ought to sum up to 100% and mirror services get whatever is set as percentage.

If one wants to support multiple mirror services then maybe weight is better than percentage but that weight only applies within the set of mirror services

rrva avatar Apr 04 '23 10:04 rrva

@skriss what do you think of the proposed semantics?

rrva avatar Apr 13 '23 15:04 rrva

@rrva sorry for the delay here. The main issue is that there is not currently a Mirror object; it's just a boolean field that's defined on a particular service. Contour also only currently allows a single mirror service, though Envoy supports >1 per route.

I think roughly, our two options are:

  1. Overload weight to mean "percentage of total requests to also send to the mirror" for a service with mirror: true.
  2. Implement this proposal to add a proper mirror section to route, that can contain structured config for mirror services, including a percentage field.

1 has the benefit of being very simple to implement now, at the cost of some cognitive overhead for the user to understand the different meaning of weight for mirror vs. non-mirror services. 2 would replace the current mirror: true field, but given that the HTTPProxy API is at v1, we can't simply remove the existing mirror field, and would need to think through all of the versioning/compatibility/conversion issues.

IMO I'm leaning towards 1. Would like to get some other maintainers' thoughts as well, cc @sunjayBhatia @tsaarni

skriss avatar May 08 '23 18:05 skriss

Considering that the current behavior is to ignore users request to mirror only part of the traffic, it does not sound that bad to go with (1) and redefine weight as percentage, even when that is not what it was meant to be. It still likely achieves what the user wants, when property documented, and it avoids API deprecations which would be very confusing for the user as well.

tsaarni avatar May 10 '23 17:05 tsaarni

Is there a workaround to achieve the following (it can be a bit ugly). I want 10% of traffic to be mirrored, the rest only sent to the primary service. I need to use a httpproxy rule. My current rule looks like this:

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: myservice
spec:
  routes:
    - conditions:
        - prefix: /
      loadBalancerPolicy:
        strategy: WeightedLeastRequest
      services:
        - name: myservice
          port: 8080
        - name: myservice-stats
          port: 8080
          mirror: true

I want 10% to go to myservice-stats. I can go for a more complex solution as a workaround

rrva avatar May 22 '23 10:05 rrva

@rrva I do not believe there's any way to accomplish this today via Contour, we'll need to make a code change here.

skriss avatar May 22 '23 16:05 skriss

The issue I want to prevent now is that the mirror service gets overloaded. It is scaled way less than the primary service on purpose. What if I modify the queueing for it to do load shedding? I was thinking of

projectcontour.io/max-pending-requests: The maximum number of pending requests that a single Envoy instance allows to the Kubernetes Service; defaults to 1024.

rrva avatar May 22 '23 16:05 rrva