Tablet throttler: multi-metric support

Open shlomi-noach opened this issue 1 month ago • 9 comments

Fixes https://github.com/vitessio/vitess/issues/15624

Preface

This PR adds multi-metrics support to the tablet throttler. It is pretty large, as will be explained shortly, and is submitted as a Draft. I will break it down into smaller and hopefully more manageable PRs, something I could not do while writing this code. To explain a bit: multi-metrics support adds a new dimension of complexity to the already multi-dimensional throttler. I chose to remove legacy or unused dimensions from the existing codebase so as to simplify the result. While this PR is not a redesign (the main elements remain the same), it shaves and refactors a lot of code.

This PR comment details all changes made, in the form of documentation. I expect it to later serve as the basis for documentation updates. We'll take a step back, define some concepts, explain how these concepts connect or interact, describe the user interface, and then also discuss internals.

Objective

Requested by multiple production scenarios: we wish to be able to, for example, throttle some workflows based on both replication lag and load average, while allowing others to throttle based on replication lag only, and while completely rejecting (or alternatively completely exempting) all other requests.

We seek a fine-grained approach that is still maintainable and comprehensible.

Timeline

While v20 is around the corner, let's not rush it. I am not expecting this PR to land in v20 (glad if it does, but only if it happens to).

Breakdown PRs

This branch (throttler-multi-metrics) will remain unmerged, and serve as the basis for follow-up incremental PRs. But I do think the functionality should be merged as a whole. I suggest creating a new throttler-multi-metrics-incremental branch, and merging the incremental PRs into that branch. Finally, we will merge throttler-multi-metrics-incremental into main, at which time this branch, throttler-multi-metrics, will become implicitly merged.

Backwards compatibility

The changes are backwards compatible with single-metric throttlers, in both directions: multi-metric Primary to single-metric replica, and single-metric Primary to multi-metric replica.

Tests

Added entities and functionalities are all tested, the vast majority via unit testing. The throttler_test.go unit test now operates a full-blown throttler, mocking a topology server and mocking replica results, and can also act as a replica. The case for endtoend tests is now mostly in testing the full flow through vtctldclient, as well as controlling an actual lagging replica.

Documentation

Concepts

Tablets

The throttler runs as part of the tablet server. The throttler can be disabled or enabled, based on the tablet throttler configuration as part of the Keyspace or SrvKeyspace in the topo service. All tablets sharing the same keyspace read the same throttler configuration. Thus, the tablet throttlers in a keyspace are either all enabled or all disabled, irrespective of shards and tablet types.

Tablets in the same shard collaborate. The Primary tablet polls the replica tablets, and replica tablets report and sometimes push throttler notifications to the Primary.

However, we limit the collaboration to specific tablet types, based on the --throttle_tablet_types VTTablet flag. By default, the Primary only collaborates with replica tablet types, which means tablets such as backup do not affect any throttling behavior.

Metrics

The objective of the throttler is to push back work based on database load. Previously, this was done based on a single metric, which could be either the replication lag, or the result of a custom query. Now, the throttler collects multiple metrics. The current supported metrics are:

  • Replication lag (lag), measured in seconds.
  • Load average (loadavg), per core, on the tablet server/container.
  • MySQL Threads_running value (threads_running).
  • Custom query (custom) as defined by the user.

This list is expected to expand in the future.

All metrics are float64 values, and are expected to be non-negative. Metrics are identified by name (lag, loadavg, etc.).

Thresholds

A metric value can be good or bad. Each metric is assigned a threshold. Below that threshold, the metric is good. At or above the threshold, the metric is deemed bad. The higher the metric, the worse it is.

Each metric has a "factory default" threshold, e.g.:

  • 5 (5 seconds) for lag.
  • 1.0 (per core) for loadavg.
  • 100 for threads_running.

Thresholds are positive values. A threshold of 0 is considered undefined.

The user can set their own thresholds, overriding the factory defaults. The user defined thresholds are persisted as part of the throttler configuration under the Keyspace entry in the topo service.
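
The threshold rules above can be sketched in a few lines of Python. This is an illustration only, not the throttler's Go implementation; the names FACTORY_DEFAULTS, effective_threshold and is_good are made up for this example, and how an undefined (0) threshold is treated at check time is deliberately left out.

```python
# Illustrative sketch only; all names here are hypothetical, not Vitess identifiers.
FACTORY_DEFAULTS = {"lag": 5.0, "loadavg": 1.0, "threads_running": 100.0}


def effective_threshold(metric, user_thresholds):
    """Return the user-defined threshold if set (> 0), else the factory default."""
    user_value = user_thresholds.get(metric, 0.0)
    if user_value > 0:
        return user_value
    # A threshold of 0 is considered undefined, so fall back to the default.
    return FACTORY_DEFAULTS.get(metric, 0.0)


def is_good(value, threshold):
    """A metric is good strictly below its threshold; at or above it is bad."""
    return value < threshold


print(effective_threshold("loadavg", {"loadavg": 2.5}))  # 2.5
print(effective_threshold("loadavg", {"loadavg": 0}))    # 1.0
print(is_good(5.0, effective_threshold("lag", {})))      # False
```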

Scopes

We can observe metrics in two scopes: self, or shard.

Each tablet's throttler collects metrics from its own tablet and from the MySQL server operated by the tablet. Each tablet then refers to those metrics in the self scope.

The Primary tablet further collects metrics from shard tablets (limited by the --throttle_tablet_types flag as mentioned above). It then uses the maximum (read: worst) value collected, including its own, as the shard metric value.
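
As a minimal sketch of that aggregation (hypothetical function name, not the actual throttler code), the shard value is simply the maximum over the Primary's own value and whatever it collected from the other tablets:

```python
# Illustrative sketch only; shard_metric_value is a hypothetical name.
def shard_metric_value(primary_value, collected_values):
    """Worst (maximum) value across the Primary's own metric and the collected ones."""
    return max([primary_value, *collected_values])


print(shard_metric_value(0.2, [1.5, 0.7]))  # 1.5
print(shard_metric_value(0.2, []))          # 0.2
```

With no collected values, the shard value equals the tablet's own value, which matches the replica behavior described for shard/lag.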

We can therefore refer to scoped metrics. On any tablet, we can query for self or shard metrics:

  • self/loadavg: the load average on a specific tablet.
  • self/lag: the lag on a specific tablet. While this makes most sense to query on a replica, it's also an indicative value on the Primary. The throttler measures lag using heartbeat injection. In the case of extremely high workload, this value can be indicative of transaction commit latencies.
  • shard/lag: when querying the Primary, this returns the highest replication lag across the shard. A replica does not have the collective metrics across the shard, and the value effectively equals self/lag.

Each metric has a default scope:

  • lag defaults to shard scope.
  • All other metrics default to self scope.

Querying a Primary tablet for the lag metric is therefore equal to querying for shard/lag, and querying for threads_running is equal to querying for self/threads_running.
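
The default-scope rule can be illustrated with a small Python sketch. The names DEFAULT_SCOPES and resolve_scoped_metric are assumptions made for this example and do not come from the codebase; the backwards-compatible bare self and shard queries are not handled here.

```python
# Illustrative sketch only; names are hypothetical, not Vitess identifiers.
DEFAULT_SCOPES = {"lag": "shard"}  # every other metric defaults to "self"


def resolve_scoped_metric(name):
    """Turn 'shard/loadavg' or 'lag' into a (scope, metric) pair."""
    if "/" in name:
        scope, metric = name.split("/", 1)
        if scope not in ("self", "shard"):
            raise ValueError(f"unknown scope: {scope}")
        return scope, metric
    # No explicit scope given: apply the metric's default scope.
    return DEFAULT_SCOPES.get(name, "self"), name


print(resolve_scoped_metric("lag"))              # ('shard', 'lag')
print(resolve_scoped_metric("threads_running"))  # ('self', 'threads_running')
print(resolve_scoped_metric("shard/loadavg"))    # ('shard', 'loadavg')
```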

For backwards compatibility, it is also possible to query for the self or for the shard metrics, in which case the result is based on either the lag metric (if custom-query is undefined) or the custom metric (if custom-query is defined).

Apps

A client that connects to the throttler and asks for throttling advice identifies itself as an "app" (legacy term from a previous incarnation). Example apps are VReplication or the Table Lifecycle. Apps identify by name. Examples:

  • vreplication: any VReplication workflow.
  • tablegc: table lifecycle.
  • online-ddl: any Online DDL operation, whether Vitess or gh-ost.
  • vplayer: a submodule of VReplication.
  • schema-tracker: the internal schema tracker.

Some app names are special:

  • vitess: used by the throttlers themselves, when the Primary checks the shard replicas, or when a throttler checks itself.
  • always-throttled-app: useful for testing/troubleshooting, an app whose checks the throttler will always reject.
  • test: used in testing.
  • all: a catch-all app, used by app rules and app metrics (see below). If defined, it applies to any app that doesn't have any explicit rules/metrics.

Clients can identify by multiple app names, separated by colons. For example, the name vcopier:d666bbfc_169e_11ef_b0b3_0a43f95f28a3:vreplication:online-ddl stands for:

  • An Online DDL,
  • That uses vreplication strategy,
  • With a d666bbfc_169e_11ef_b0b3_0a43f95f28a3 workflow ID,
  • Currently issuing rowcopy via vcopier.

The throttler treats such an app as the combined check of multiple apps, to each of which it applies app metrics and app rules, as discussed below.
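
For illustration, splitting such a compound name into its individual app names could look like the Python sketch below. split_app_name is a hypothetical helper; how the throttler combines the per-app results is not shown here.

```python
# Illustrative sketch only; split_app_name is a hypothetical helper.
def split_app_name(app_name):
    """Split a colon-separated compound app name into its individual app names."""
    return [part for part in app_name.split(":") if part]


compound = "vcopier:d666bbfc_169e_11ef_b0b3_0a43f95f28a3:vreplication:online-ddl"
for name in split_app_name(compound):
    print(name)
```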

Checks

A check is a request made to the throttler, asking for go/no-go advice. The check identifies by an app name (defaults to vitess). The throttler looks at the metrics assigned to the app (see below). If all of them are below their respective thresholds, the throttler accepts the request (returns an OK response). If any of them reaches or exceeds its threshold, the throttler rejects the request (returns a non-OK response).
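
A simplified model of that decision, written in Python for illustration: the function name check and its input shape are assumptions, every metric is assumed to have a defined (positive) threshold, and the 429 code mirrors the rejected example response shown in this description rather than an exhaustive list of status codes.

```python
# Illustrative sketch only; not the throttler's actual check implementation.
OK = 200
THRESHOLD_EXCEEDED = 429


def check(metrics):
    """metrics maps a metric name to a (value, threshold) pair.

    Returns (overall_status, per_metric_status). The check passes only if
    every metric is strictly below its threshold.
    """
    per_metric = {
        name: OK if value < threshold else THRESHOLD_EXCEEDED
        for name, (value, threshold) in metrics.items()
    }
    all_ok = all(code == OK for code in per_metric.values())
    return (OK if all_ok else THRESHOLD_EXCEEDED), per_metric


print(check({"lag": (0.607775, 5)}))
print(check({"lag": (20.973689, 5), "threads_running": (2, 100)}))
```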

Checks are made internally by the various vitess components, and the responses are likewise analyzed internally. The user is also able to invoke a check, for automation or troubleshooting purposes. For example:

$ vtctldclient --server localhost:15999 CheckThrottler --app-name "vreplication" zone1-0000000101  | jq .
{
  "status_code": 200,
  "value": 0.607775,
  "threshold": 5,
  "error": "",
  "message": "",
  "recently_checked": true,
  "metrics": {
    "lag": {
      "name": "lag",
      "status_code": 200,
      "value": 0.607775,
      "threshold": 5,
      "error": "",
      "message": "",
      "scope": "shard"
    }
  }
}

The response includes:

  • Status code (based on HTTP responses, i.e. 200 for "OK")
  • Any error message
  • The list of metrics checked; for each metric:
    • Its status code
    • Its threshold
    • The scope it was checked with

How concepts are combined and used

Metric thresholds

Each metric is assigned a threshold. Vitess supplies factory defaults for these thresholds, but the user may override them manually, like so:

$ vtctldclient UpdateThrottlerConfig --metric-name "loadavg" --threshold "2.5" commerce

In this example, the loadavg metric value is henceforth deemed good if below 2.5. The threshold is stored as part of the keyspace entry in the topo service:

$ vtctldclient GetKeyspace commerce | jq .keyspace.throttler_config.metric_thresholds
{
  "loadavg": 2.5
}

The threshold applies to any check for that specific metric (see App Metrics, below) on any tablet in this keyspace. The threshold is also reflected in the throttler status:

$ vtctldclient GetThrottlerStatus zone1-0000000101  | jq .metric_thresholds
{
  "config/loadavg": 2.5,
  "custom": 0,
  "default": 5,
  "lag": 5,
  "loadavg": 2.5,
  "threads_running": 100
}

Use a 0 threshold value to restore the threshold back to factory defaults.

App Metrics

By default, when an app checks the throttler, the result is based on replication lag. If the custom query is set, then the result is based on the custom query result. It is possible to assign specific metrics to specific apps, like so:

$ vtctldclient UpdateThrottlerConfig --app-name "online-ddl" --app-metrics "lag,threads_running" commerce

From that moment on, Online DDL operations will throttle on both high lag and high threads_running. If either of these values exceeds its respective threshold, Online DDL gets throttled. However, it's important to note the scope of the metrics, which is left to the defaults here. To elaborate, it is possible to further indicate metric scopes, for example:

$ vtctldclient UpdateThrottlerConfig --app-name "online-ddl" --app-metrics "lag,threads_running,shard/loadavg" commerce

In this example, Online DDL will throttle when:

  • The highest lag value in all shard tablets exceeds the lag threshold (lag's default scope is shard), or
  • The number of threads_running on the Primary exceeds its threshold (threads_running's default scope is self), or
  • The highest loadavg value in all shard tablets exceeds its threshold (loadavg's default scope is self, but the assignment explicitly required shard scope).

It's possible to set metrics for the all app. Continuing our example setup, we now run:

$ vtctldclient UpdateThrottlerConfig --app-name "all" --app-metrics "lag,custom" commerce

Checks made to the throttler by online-ddl, or by any multi-named app that includes it, such as vcopier:d666bbfc_169e_11ef_b0b3_0a43f95f28a3:vreplication:online-ddl, throttle based on lag,threads_running,shard/loadavg, because that's an explicit assignment:

$ vtctldclient CheckThrottler --app-name online-ddl zone1-0000000100  | jq .
{
  "status_code": 200,
  "value": 1.473868,
  "threshold": 5,
  "error": "",
  "message": "",
  "recently_checked": true,
  "metrics": {
    "lag": {
      "name": "lag",
      "status_code": 200,
      "value": 1.473868,
      "threshold": 5,
      "error": "",
      "message": "",
      "scope": "shard"
    },
    "loadavg": {
      "name": "loadavg",
      "status_code": 200,
      "value": 0.00375,
      "threshold": 2.5,
      "error": "",
      "message": "",
      "scope": "shard"
    },
    "threads_running": {
      "name": "threads_running",
      "status_code": 200,
      "value": 2,
      "threshold": 100,
      "error": "",
      "message": "",
      "scope": "self"
    }
  }
}

Checks made by other apps, e.g. vreplication, will now throttle based on lag,custom. vreplication does not have any assigned metrics, and therefore falls under all's assignments.

$ vtctldclient --server localhost:15999 CheckThrottler --app-name vreplication zone1-0000000100  | jq .
{
  "status_code": 429,
  "value": 20.973689,
  "threshold": 5,
  "error": "threshold exceeded",
  "message": "threshold exceeded",
  "recently_checked": true,
  "metrics": {
    "custom": {
      "name": "custom",
      "status_code": 200,
      "value": 0,
      "threshold": 0,
      "error": "",
      "message": "",
      "scope": "self"
    },
    "lag": {
      "name": "lag",
      "status_code": 429,
      "value": 20.973689,
      "threshold": 5,
      "error": "",
      "message": "threshold exceeded",
      "scope": "shard"
    }
  }
}

The assignments are visible in the throttler status:

$ vtctldclient GetThrottlerStatus zone1-0000000101  | jq .app_checked_metrics
{
  "all": "lag,custom",
  "online-ddl": "lag,threads_running,shard/loadavg"
}
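
The lookup order described in this section (explicit assignment first, then the all app, then the legacy default) can be sketched in Python. metrics_for_app is a hypothetical name, the input mirrors the app_checked_metrics output above, and the special handling of the vitess app is not modeled.

```python
# Illustrative sketch only; metrics_for_app is a hypothetical helper.
def metrics_for_app(app_name, app_checked_metrics, custom_query_defined=False):
    """Return the list of (possibly scoped) metric names checked for an app."""
    if app_name in app_checked_metrics:
        # Explicit assignment for this app wins.
        return app_checked_metrics[app_name].split(",")
    if "all" in app_checked_metrics:
        # Otherwise fall back to the catch-all assignment, if any.
        return app_checked_metrics["all"].split(",")
    # No assignment at all: custom if a custom query is defined, else lag.
    return ["custom"] if custom_query_defined else ["lag"]


assignments = {
    "all": "lag,custom",
    "online-ddl": "lag,threads_running,shard/loadavg",
}
print(metrics_for_app("online-ddl", assignments))
print(metrics_for_app("vreplication", assignments))
print(metrics_for_app("vreplication", {}))
```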

To deassign metrics from an app, supply an empty value like so:

$ vtctldclient UpdateThrottlerConfig --app-name "all" --app-metrics "" commerce

The special app vitess is internally assigned all known metrics, at all times.

App rules

This PR has no changes to app rules logic.

The user may impose additional throttling rules on any given app. A rule is limited by a duration (after which the rule expires and is removed), and can:

  • Further reject checks based on a rejection ratio (0.0 for no extra rejection .. 1.0 for complete rejection) before even checking actual metrics/thresholds. This effectively "slows down" the app.
  • Or, completely exempt the app: the throttler will always allow the app to proceed irrespective of metric values or assigned app metrics.

Examples:

Throttle the vreplication app, so that 80% of its checks are denied before even consulting actual metrics. The rule auto-expires after 30 minutes. Note: the remaining 20% of checks still need to comply with actual metrics/thresholds.

$ vtctldclient UpdateThrottlerConfig --throttle-app "vreplication" --throttle-app-ratio "0.8" --throttle-app-duration "30m" commerce

Exempt vreplication from being throttled, even if metrics exceed their thresholds (e.g. even if lag is high). Expire after 1 hour:

$ vtctldclient UpdateThrottlerConfig --throttle-app "vreplication" --throttle-app-duration "1h" --throttle-app-exempt commerce

The all app is accepted, and applies to all apps that do not otherwise have a specific rule. Examples:

$ vtctldclient UpdateThrottlerConfig --throttle-app "all" --throttle-app-ratio "0.25" --throttle-app-duration "1h" commerce
$ vtctldclient UpdateThrottlerConfig --throttle-app "online-ddl" --throttle-app-ratio "0.80" --throttle-app-duration "1h" commerce

In the above we push back 25% of checks for all apps, irrespective of actual metrics, except for online-ddl checks, where we reject 80% of its checks.

$ vtctldclient UpdateThrottlerConfig --throttle-app "all" --throttle-app-ratio "0.8" --throttle-app-duration "1h" commerce
$ vtctldclient UpdateThrottlerConfig --throttle-app "vreplication" --throttle-app-duration "1h" --throttle-app-exempt commerce

In the above we push back 80% of checks from all apps, except for vreplication which is completely exempted.
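
To make the interplay between specific rules, the all rule, ratios and exemptions concrete, here is a hedged Python sketch. The rule representation, the pre_check name and the use of a random draw to apply the ratio are all assumptions for illustration; rule expiry is ignored, and the actual throttler may apply the ratio differently.

```python
import random


# Illustrative sketch only; names and rule shape are hypothetical.
def pre_check(app_name, rules, rng=random.random):
    """Apply app rules before any metric is consulted.

    rules maps an app name to a dict such as {"ratio": 0.8} or {"exempt": True}.
    Returns "exempt", "rejected", or "check-metrics".
    """
    if app_name in rules:
        rule = rules[app_name]   # a specific rule takes precedence
    elif "all" in rules:
        rule = rules["all"]      # otherwise the catch-all rule applies
    else:
        return "check-metrics"
    if rule.get("exempt"):
        return "exempt"
    # Reject the configured share of checks (assumed here to be a random draw).
    if rng() < rule.get("ratio", 0.0):
        return "rejected"
    return "check-metrics"


rules = {"all": {"ratio": 0.8}, "vreplication": {"exempt": True}}
print(pre_check("vreplication", rules))  # exempt
print(pre_check("online-ddl", rules))    # "rejected" about 80% of the time
```

Passing a deterministic rng (for example lambda: 0.5) makes the behavior reproducible in tests.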

It is possible to expire (remove the rule) early via:

$ vtctldclient UpdateThrottlerConfig --unthrottle-app "vreplication" commerce

Commands and flags

These are the vtctldclient commands to control or query the tablet throttler:

UpdateThrottlerConfig

Enable or disable the throttler:

$ vtctldclient UpdateThrottlerConfig --enable commerce
$ vtctldclient UpdateThrottlerConfig --disable commerce

Set a metric threshold:

$ vtctldclient UpdateThrottlerConfig --metric-name "loadavg" --threshold "2.5" commerce

Clear a metric threshold (return to "factory defaults"):

$ vtctldclient UpdateThrottlerConfig --metric-name "loadavg" --threshold "0" commerce

For compatibility with pre-multi-metrics throttlers, set the "default" threshold (applies to replication lag if custom query is undefined):

$ vtctldclient UpdateThrottlerConfig --threshold "10.0" commerce

Set a custom query:

$ vtctldclient UpdateThrottlerConfig --custom-query "show global status like 'Threads_connected'" commerce

This applies to the custom metric. In pre multi-metric throttlers, checks are validated against the custom value. In multi-metric throttlers, lag and custom are distinct metrics, and the user may assign different apps to different metrics as described in this doc.

Clear the custom query:

$ vtctldclient UpdateThrottlerConfig --custom-query "" commerce

Assign metrics to an app, use default metric scopes:

$ vtctldclient UpdateThrottlerConfig --app-name "online-ddl" --app-metrics "lag,threads_running" commerce

Assign metrics to an app, use explicit metric scopes:

$ vtctldclient UpdateThrottlerConfig --app-name "online-ddl" --app-metrics "lag,shard/threads_running" commerce

Remove assignment from app:

$ vtctldclient UpdateThrottlerConfig --app-name "online-ddl" --app-metrics "" commerce

Assign metrics to all apps, except for those which have an explicit assignment:

$ vtctldclient UpdateThrottlerConfig --app-name "all" --app-metrics "lag,shard/loadavg" commerce

Throttle an app:

$ vtctldclient UpdateThrottlerConfig --throttle-app "online-ddl" --throttle-app-ratio "0.80" --throttle-app-duration "1h" commerce

Unthrottle an app (expire early):

$ vtctldclient UpdateThrottlerConfig --unthrottle-app "online-ddl" commerce

Exempt an app:

$ vtctldclient UpdateThrottlerConfig --throttle-app "vreplication" --throttle-app-duration "1h" --throttle-app-exempt commerce

Unexempting an app is done by removing the rule:

$ vtctldclient UpdateThrottlerConfig --unthrottle-app "vreplication" commerce

Throttle all apps except those that already have a specific rule:

$ vtctldclient UpdateThrottlerConfig --throttle-app "all" --throttle-app-ratio=0.25 --throttle-app-duration "1h" commerce

CheckThrottler

Issue a check on a tablet's throttler, optionally identify as some app. Use in automation or in troubleshooting.

Get the response for a vreplication app check:

$ vtctldclient CheckThrottler --app-name "vreplication" zone1-0000000101

Normal checks do not renew the heartbeat lease. Override to renew the heartbeat lease:

$ vtctldclient CheckThrottler --app-name "vreplication" --requests-heartbeats zone1-0000000101

Check as vitess app:

$ vtctldclient CheckThrottler zone1-0000000101

Force a specific scope, overriding metric defaults or assigned metric scopes:

$ vtctldclient CheckThrottler --app-name "online-ddl" --scope "shard" zone1-0000000101

GetThrottlerStatus

See the state of the throttler, including what the throttler perceives to be current metric values, metrics health, metric thresholds, assigned metrics, app rules, and more.

$ vtctldclient GetThrottlerStatus zone1-0000000101

End of docs.

Related Issue(s)

https://github.com/vitessio/vitess/issues/15624

Checklist

  • [ ] "Backport to:" labels have been added if this change should be back-ported to release branches
  • [ ] If this change is to be back-ported to previous releases, a justification is included in the PR description
  • [ ] Tests were added or are not required
  • [ ] Did the new or modified tests pass consistently locally and on CI?
  • [ ] Documentation was added or is not required

Deployment Notes

shlomi-noach, May 21 '24 12:05